Unified Embedding-Based Personalized Product Retrieval in Etsy Search


Core Concepts
A novel two-tower model with a unified embedding-based product encoder and joint user-query encoder to handle diverse search queries and provide personalized product recommendations on an e-commerce platform.
Summary

The paper presents a unified embedding-based personalized product retrieval system called UEPPR, which addresses the challenges of vocabulary gap and personalization in e-commerce search.

Key highlights:

  • The product encoder is a unified model that captures various complementary aspects of a product, including transformer-based representations, bipartite graph embeddings, and term-based embeddings.
  • The joint query-user encoder incorporates query text, user location, and user engagement history to provide personalized recommendations.
  • The authors employ novel training strategies, including hard negative mining, pre-training of language models, and a multi-part hinge loss function to optimize the model.
  • An ANN-based product boosting approach is used to balance product relevance and quality during candidate retrieval.
  • Extensive offline evaluations and online A/B testing demonstrate significant improvements in search purchase rate and site-wide conversion rate.
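To make the two-tower scoring and hinge-style training objective in the highlights above more concrete, here is a minimal sketch in plain Python. It assumes the query-user tower and the product tower have already produced fixed-length vectors, and it uses a generic pairwise hinge loss with a single margin. The function names and the margin value are illustrative assumptions, and this is not the paper's exact multi-part formulation.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pairwise_hinge_loss(query_vec, pos_vec, neg_vecs, margin=0.1):
    """Mean hinge loss over (positive, negative) product pairs.

    The loss is zero for a negative once the positive product outscores
    it by at least `margin` (a hypothetical hyperparameter).
    """
    pos_score = dot(query_vec, pos_vec)
    losses = [
        max(0.0, margin - pos_score + dot(query_vec, neg))
        for neg in neg_vecs
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    query = [1.0, 0.0]            # output of the joint query-user tower
    purchased = [1.0, 0.0]        # output of the product tower (positive)
    negatives = [[0.0, 1.0],      # easy negative: unrelated product
                 [1.0, 0.0]]      # hard negative: scores as high as the positive
    print(pairwise_hinge_loss(query, purchased, negatives))
```

Only the hard negative contributes to the loss in this toy example, which is the intuition behind mining hard negatives: easy negatives already sit beyond the margin and provide no training signal.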

The paper provides a comprehensive overview of the system design, feature engineering, and deployment challenges, offering valuable insights for building effective personalized retrieval systems in e-commerce.

Statistics
The dataset contains up to 30 million users and more than 150 million interactions. The 98th percentile distance to user for candidates in the location-enabled model is 2600 miles, compared to 4200 miles in the baseline. The personalized UEPPR variants had a greater impact on the signed-in and habitual buyer segments, improving site-wide conversion rate by 2.63% and organic search purchase rate by 5.58%.
Quotes
"Our personalized retrieval model significantly improves the overall search experience, as measured by a 5.58% increase in search purchase rate and a 2.63% increase in site-wide conversion rate, aggregated across multiple A/B tests - on live traffic."

"We employ black box optimization to identify globally optimal quality weights which maximize desired target metrics with realistic serving constraints."
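The quality weights mentioned in the second quote can be illustrated with a small sketch. One common way to fold query-independent quality scores into an inner-product search is to append the scores to each product vector and the corresponding weights to the query vector, so that the inner product becomes relevance plus weighted quality. Whether this matches the paper's exact construction is an assumption, all names below are hypothetical, and a brute-force scan stands in for a real ANN index.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def augment_product(embedding, quality_scores):
    """Append query-independent quality scores to a product embedding."""
    return list(embedding) + list(quality_scores)


def augment_query(embedding, quality_weights):
    """Append one weight per quality score to a query embedding."""
    return list(embedding) + list(quality_weights)


def top_k(query_vec, product_vecs, k):
    """Brute-force stand-in for ANN search: indices of the k best scores."""
    ranked = sorted(
        range(len(product_vecs)),
        key=lambda i: dot(query_vec, product_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    products = [
        augment_product([1.0, 0.0], [0.0]),  # most relevant, low quality
        augment_product([0.9, 0.0], [0.5]),  # slightly less relevant, higher quality
    ]
    boosted_query = augment_query([1.0, 0.0], [0.4])
    print(top_k(boosted_query, products, k=1))
```

In a setup like this, the weights appended to the query are the values a black-box optimizer would tune against the target metrics.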

Key insights distilled from

by Rishikesh Jh... arxiv.org 09-26-2024

https://arxiv.org/pdf/2306.04833.pdf
Unified Embedding Based Personalized Retrieval in Etsy Search

Deeper Inquiries

How can the proposed approach be extended to handle dynamic user preferences and evolving product catalogs in e-commerce search?

The proposed Unified Embedding Based Personalized Retrieval (UEPPR) system can be extended to accommodate dynamic user preferences and evolving product catalogs through several strategies.

First, implementing a continuous learning framework would allow the model to adapt to changes in user behavior and product offerings in real time. This could involve periodically retraining the model on fresh data, including recent user interactions and newly added products, to ensure that the embeddings remain relevant.

Second, incorporating user feedback mechanisms can enhance personalization. For instance, allowing users to provide explicit feedback on search results (e.g., likes, dislikes, or relevance ratings) can help refine the embeddings and improve the model's understanding of user preferences. This feedback can be integrated into the training data, enabling the model to learn from user interactions more effectively.

Third, leveraging contextual signals such as time of day, seasonality, and trending products can help the model adjust to evolving user preferences. For example, during holiday seasons, the model could prioritize products that are popular for gifting, while in other periods, it could focus on items that align with current trends.

Lastly, employing a multi-faceted user profile that captures various dimensions of user behavior (such as browsing history, purchase history, and social interactions) can provide a more comprehensive view of user preferences. This holistic approach can enhance the model's ability to deliver personalized search results that resonate with users' current interests and needs.
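As one illustration of keeping user representations fresh between retraining runs, the sketch below builds a user vector as a time-decayed average of the embeddings of recently engaged products. This is an assumed technique for illustration, not something described in the paper, and the half-life value and function name are hypothetical.

```python
def decayed_user_embedding(interactions, now_days, half_life_days=30.0):
    """Time-decayed average of engaged product embeddings.

    `interactions` is a list of (timestamp_in_days, product_vector) pairs.
    An interaction that is `half_life_days` old counts half as much as
    one that happened at `now_days`.
    """
    if not interactions:
        return []
    dim = len(interactions[0][1])
    weighted_sum = [0.0] * dim
    weight_total = 0.0
    for timestamp, vector in interactions:
        age = max(0.0, now_days - timestamp)
        weight = 0.5 ** (age / half_life_days)
        for d in range(dim):
            weighted_sum[d] += weight * vector[d]
        weight_total += weight
    return [value / weight_total for value in weighted_sum]


if __name__ == "__main__":
    history = [
        (0.0, [1.0, 0.0]),   # older engagement
        (30.0, [0.0, 1.0]),  # engagement from today
    ]
    print(decayed_user_embedding(history, now_days=30.0))
```

A shorter half-life makes the profile track recent interests more aggressively, at the cost of forgetting long-term preferences sooner.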

What are the potential challenges and trade-offs in incorporating more complex user behavior signals, such as browsing patterns and social interactions, into the personalization model?

Incorporating more complex user behavior signals, such as browsing patterns and social interactions, into the personalization model presents several challenges and trade-offs.

One significant challenge is the increased computational complexity. More intricate models that account for diverse user signals require additional processing power and memory, which can lead to higher latency in real-time search scenarios. This is particularly critical in e-commerce, where low-latency responses are essential for maintaining a positive user experience.

Another challenge is the potential for overfitting. As the model becomes more complex, there is a risk that it may learn noise in the data rather than meaningful patterns. This can result in poor generalization to new users or products, ultimately degrading the search experience. To mitigate this, careful feature selection and regularization techniques must be employed to ensure that the model remains robust.

Additionally, privacy concerns arise when handling sensitive user data, such as browsing history and social interactions. Ensuring compliance with data protection regulations (e.g., GDPR) while still providing personalized experiences can be a delicate balance. Organizations must implement transparent data usage policies and allow users to control their data preferences.

Finally, the trade-off between personalization and diversity in search results must be considered. While personalized results can enhance user satisfaction, they may also lead to a filter bubble effect, where users are only exposed to a narrow range of products. To address this, the model should incorporate mechanisms that promote diversity in search results, ensuring that users encounter a broader array of options.
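One well-known mechanism for promoting diversity is maximal marginal relevance (MMR) re-ranking, sketched below. MMR is offered here as an example technique and is not attributed to the paper. The `trade_off` parameter and function names are assumptions, and relevance scores are taken as given.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0


def mmr_rerank(relevance, vectors, k, trade_off=0.7):
    """Greedily pick k items, balancing relevance against redundancy.

    trade_off=1.0 ranks purely by relevance; lower values penalize items
    that are similar to anything already selected.
    """
    selected = []
    remaining = list(range(len(relevance)))
    while remaining and len(selected) < k:
        def mmr_score(i):
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance[i] - (1.0 - trade_off) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    relevance = [0.9, 0.85, 0.5]
    vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # items 0 and 1 are near-duplicates
    print(mmr_rerank(relevance, vectors, k=2, trade_off=0.5))
```

With an even trade-off, the near-duplicate of the top item is skipped in favor of a less relevant but dissimilar product, which is the behavior that counteracts a filter bubble.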

Could the unified embedding approach be applied to other domains beyond e-commerce, such as scholarly search or enterprise search, and what adaptations would be required?

The unified embedding approach, as demonstrated in the UEPPR system, can indeed be applied to other domains beyond e-commerce, such as scholarly search or enterprise search. However, several adaptations would be necessary to tailor the model to the specific characteristics and requirements of these domains.

In scholarly search, the model would need to incorporate domain-specific features, such as citation counts, publication dates, and author impact factors, to enhance the relevance of search results. Additionally, the embeddings would need to capture the semantic relationships between research topics, methodologies, and findings, which may require the integration of specialized knowledge graphs or ontologies.

For enterprise search, the model should account for organizational context, such as user roles, project affiliations, and internal document hierarchies. This could involve creating user profiles that reflect employees' responsibilities and interests within the organization. Furthermore, the model would need to handle a diverse range of document types, including reports, presentations, and emails, necessitating the development of tailored embeddings for each document type.

Moreover, both domains would benefit from incorporating collaborative filtering techniques to leverage user interactions and preferences effectively. For instance, in scholarly search, the model could analyze co-authorship patterns and citation networks to enhance the retrieval of relevant literature. In enterprise search, understanding team dynamics and project collaborations could inform the personalization of search results.

Overall, while the unified embedding approach is versatile and applicable across various domains, careful consideration of domain-specific features, user contexts, and data types is essential for optimizing its effectiveness in non-e-commerce settings.