Overview
The article describes how Pinterest built a large-scale learned retrieval system, moving from heuristic-based methods to embedding-based retrieval. It covers the architecture, methods, and results of implementing a two-tower model for candidate generation and ranking, and explains why auto-retraining matters for maintaining model performance.
What You'll Learn
1. How to implement a two-tower model for learned retrieval in a recommendation system
2. Why auto-retraining is essential for maintaining model accuracy in real-time systems
3. How to leverage embedding-based retrieval for improved candidate generation
Prerequisites & Requirements
- Understanding of machine learning concepts and recommendation systems
- Experience with building and deploying machine learning models (optional)
Key Questions Answered
What is the architecture of the learned retrieval system at Pinterest?
The learned retrieval system at Pinterest utilizes a two-tower model architecture where one tower learns query embeddings and the other learns item embeddings. This setup allows for efficient online serving through nearest neighbor search, optimizing the retrieval process from a vast candidate pool.
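As a rough illustration of the two-tower idea, the sketch below projects query features and item features into a shared embedding space and retrieves the highest-scoring items by dot product. It is a minimal sketch under stated assumptions, not Pinterest's implementation: the random projection matrices stand in for trained towers, the feature sizes and embedding dimension are made up, and the exact top-k scan stands in for the approximate nearest neighbor index a production system would use.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 8  # hypothetical embedding size, chosen only for the example


def make_tower(input_dim: int, embed_dim: int) -> np.ndarray:
    """Return a random projection that stands in for a trained tower."""
    return rng.normal(size=(input_dim, embed_dim))


def embed(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Project features into the shared space and L2-normalize the result."""
    vectors = features @ weights
    return vectors / np.linalg.norm(vectors, axis=-1, keepdims=True)


def retrieve(query_vec: np.ndarray, item_vecs: np.ndarray, k: int) -> np.ndarray:
    """Exact top-k by dot product; a real system would query an ANN index."""
    scores = item_vecs @ query_vec
    top = np.argpartition(-scores, k - 1)[:k]   # unordered top-k candidates
    return top[np.argsort(-scores[top])]        # order them best-first


# The two towers take different inputs but emit vectors of the same size.
query_tower = make_tower(input_dim=5, embed_dim=EMBED_DIM)
item_tower = make_tower(input_dim=12, embed_dim=EMBED_DIM)

# Item embeddings can be computed offline and stored in an index.
item_features = rng.normal(size=(1000, 12))
item_index = embed(item_features, item_tower)

# The query embedding is computed at request time.
user_features = rng.normal(size=5)
user_vec = embed(user_features, query_tower)

top_items = retrieve(user_vec, item_index, k=10)
```

Because the towers only interact through the final dot product, the item side can be precomputed in bulk while the query side is computed once per request, which is what makes nearest neighbor serving practical over a large candidate pool.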
How does Pinterest ensure model version synchronization during auto-retraining?
Pinterest maintains model version synchronization by attaching model version metadata to each ANN search service host. This metadata maps model names to their latest versions, ensuring that user embeddings are computed using the correct model version, even during index rollouts.
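The version lookup described above might be sketched as follows. This is an illustrative sketch only: `AnnHost`, `ModelStore`, `pick_query_model`, and the model name `learned_retrieval` are hypothetical names invented for the example, and the strings stand in for real model objects. The point is that the serving code reads the version from the host it is about to query, rather than assuming the newest model.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class AnnHost:
    """Hypothetical ANN host: an index plus the metadata built with it."""
    index_version: str
    # Maps model name -> version of the model that produced this index.
    version_metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class ModelStore:
    """Hypothetical registry holding several query-tower versions at once."""
    models: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def get(self, name: str, version: str) -> str:
        return self.models[(name, version)]


def pick_query_model(host: AnnHost, store: ModelStore, model_name: str) -> str:
    """Select the query-tower version that matches the host's index."""
    version = host.version_metadata[model_name]
    return store.get(model_name, version)


store = ModelStore({
    ("learned_retrieval", "v1"): "query_tower_v1",
    ("learned_retrieval", "v2"): "query_tower_v2",
})

# Mid-rollout: one host still serves the old index, another the new one.
old_host = AnnHost("v1", {"learned_retrieval": "v1"})
new_host = AnnHost("v2", {"learned_retrieval": "v2"})

chosen_old = pick_query_model(old_host, store, "learned_retrieval")
chosen_new = pick_query_model(new_host, store, "learned_retrieval")
```

With this arrangement, a request routed to a host that has not yet received the new index still gets a user embedding from the matching older model, so query and item embeddings always come from the same training run.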
What are the benefits of using an embedding-based retrieval system?
Embedding-based retrieval systems improve candidate generation by leveraging user engagement data to create more relevant content recommendations. This approach has led to higher user engagement and the deprecation of less effective candidate generators, enhancing overall site performance.
What challenges does Pinterest face in deploying its learned retrieval system?
Pinterest must keep the two towers synchronized during deployment, because the indexing pipeline takes longer to finish than the viewer model takes to become ready. If the mismatch is not managed, candidate quality drops, so careful version control and rollback strategies are required.
Key Statistics & Figures
Monthly Active Users (MAUs)
500+ million
Pinterest serves over 500 million MAUs, necessitating a robust and scalable retrieval system.
Candidate Generators in Production
20+
The homefeed system utilizes over 20 candidate generators with various retrieval strategies to cater to different user engagement scenarios.
Technologies & Tools
Backend
ANN
Used for approximate nearest neighbor search in the retrieval system.
Machine Learning
Transformer
Employed in the ranking model to capture user engagement patterns.
Key Actionable Insights
1. Implement a two-tower model architecture for your recommendation systems to enhance retrieval efficiency. This approach allows for effective separation of query and item embeddings, improving the accuracy of recommendations and user engagement.
2. Establish an auto-retraining workflow to keep your models updated with the latest user trends. Regularly retraining models ensures that they adapt to changing user behaviors and preferences, maintaining their relevance and effectiveness.
3. Utilize embedding-based retrieval to streamline candidate generation from large datasets. This method leverages user engagement data, leading to more personalized recommendations and improved user satisfaction.
Common Pitfalls
1. Failing to synchronize model versions during deployment can lead to degraded candidate quality.
This issue arises because the indexing pipeline may take longer to update than the viewer model, resulting in mismatched embeddings if not carefully managed.
Related Concepts
Machine Learning
Recommendation Systems
Auto-retraining
Embedding Techniques