Building a dynamic and responsive Pinterest

Pinterest Engineering
11 min read · intermediate

Overview

The article discusses the evolution of Pinterest's backend architecture from a static content serving model to a dynamic and responsive system. It highlights the challenges faced in real-time data processing and the development of nine different systems that power features like Following Feed, Interest Feed, and recommendations.

What You'll Learn

  1. How to implement real-time data updates in a web-scale application
  2. Why separating computation from storage improves system performance
  3. How to optimize latency for machine learning ranking systems

Prerequisites & Requirements

  • Understanding of backend architecture and real-time data processing
  • Familiarity with C++ and RocksDB (optional)

Key Questions Answered

What were the main challenges in transitioning to a dynamic Pinterest?
The main challenges included the need for real-time updates of user data and content, the requirement for a high-performance machine learning ranking system, and the necessity for a candidate generation system for content recommendations. These challenges stemmed from the limitations of the previous static architecture.
How does the current Following Feed architecture differ from the previous version?
The current Following Feed architecture utilizes real-time updates and low-latency queries through systems like Apiary and Polaris, which replaced the older method of pre-generating content stored in HBase. This allows for more dynamic content delivery based on real-time user interactions.
What is the role of Scorpion in Pinterest's architecture?
Scorpion is a unified machine learning online ranking platform that powers most of Pinterest's ML models in production. It aggressively caches static feature data in local memory to achieve a high cache hit rate, which is crucial for maintaining low latency in content ranking.
How does Pinterest ensure low latency for its machine learning systems?
Pinterest ensures low latency by optimizing its systems, such as Scorpion, to achieve high concurrency and low context switching. Techniques like zero-copy data feeding into ML models and careful tuning of threading models are employed to meet stringent latency requirements.
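
The Following Feed answer above contrasts pre-generated feeds with feeds assembled by low-latency queries at read time. The following is a minimal sketch of the read-time idea only, not a description of how Apiary or Polaris work. The class name FollowingFeed, its methods, and the in-memory dictionaries are all hypothetical stand-ins chosen for illustration.

```python
import heapq
from collections import defaultdict
from itertools import islice


class FollowingFeed:
    """Toy query-time feed: nothing is pre-generated per user."""

    def __init__(self):
        # user -> set of followed sources (hypothetical in-memory stand-in)
        self.follows = defaultdict(set)
        # source -> list of (timestamp, pin_id), kept newest first
        self.pins = defaultdict(list)

    def follow(self, user, source):
        self.follows[user].add(source)

    def add_pin(self, source, timestamp, pin_id):
        # Assumes timestamps arrive in increasing order per source,
        # so inserting at the front keeps each list newest first.
        self.pins[source].insert(0, (timestamp, pin_id))

    def feed(self, user, limit=10):
        # Merge the already-sorted per-source lists at read time,
        # so a new pin or a new follow is visible on the next query.
        streams = [self.pins[source] for source in self.follows[user]]
        merged = heapq.merge(*streams, reverse=True)
        return [pin_id for _, pin_id in islice(merged, limit)]


feed = FollowingFeed()
feed.follow("alice", "board_a")
feed.follow("alice", "board_b")
feed.add_pin("board_a", 1, "p1")
feed.add_pin("board_b", 2, "p2")
feed.add_pin("board_a", 3, "p3")
print(feed.feed("alice"))
```

Because the feed is computed when it is requested, a follow made a moment ago changes the very next result. A pre-generated feed would need a separate regeneration step before the change showed up.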

Key Statistics & Figures

Monthly active users in 2015
100 million
This was the user base size when Pinterest transitioned from a static to a dynamic content serving model.
P99 latency requirement for ML ranking system
a few tens of milliseconds
This latency requirement is crucial for ensuring a responsive user experience during content ranking.
Cache hit rate for Scorpion
over 90%
This high cache hit rate is essential for maintaining low latency in the ranking process.
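
To make the cache hit rate figure concrete, here is a minimal sketch of a local in-memory feature cache that tracks its own hit rate. It assumes a least-recently-used eviction policy, which the summary does not specify. FeatureCache, fetch_remote, and the capacity value are hypothetical names introduced only for this example and are not Scorpion's actual API.

```python
from collections import OrderedDict


class FeatureCache:
    """Toy LRU cache in front of a slower feature lookup."""

    def __init__(self, capacity, fetch_remote):
        self.capacity = capacity
        self.fetch_remote = fetch_remote  # hypothetical slow lookup function
        self.entries = OrderedDict()      # key -> features, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            # Mark the entry as most recently used.
            self.entries.move_to_end(key)
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        value = self.fetch_remote(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = FeatureCache(capacity=2, fetch_remote=lambda pin_id: {"pin": pin_id})
for pin_id in [1, 1, 2, 3, 1]:
    cache.get(pin_id)
print(cache.hit_rate())
```

Every miss pays for the slow lookup, so the hit rate directly controls how many requests stay on the fast local path.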

Key Actionable Insights

  1. Implement real-time data processing to enhance user experience in applications.
     By transitioning from static to dynamic content delivery, applications can provide more relevant and timely information to users, which can significantly improve engagement and satisfaction.
  2. Optimize machine learning systems for low latency by caching frequently accessed data.
     Using in-memory caching strategies can drastically reduce response times for ML models, allowing for real-time scoring and recommendations that enhance user interaction.
  3. Separate computation from storage to improve system scalability and performance.
     This architectural decision allows for more efficient resource utilization and easier maintenance, especially as the system grows and requires more complex data processing.

Common Pitfalls

  1. Co-locating computation and storage can lead to operational challenges.
     This issue arises as the complexity of feature data increases, making it difficult to manage and scale the system effectively. Separating these concerns can alleviate many of these difficulties.
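
One way to picture the separation described above is to have the ranking code depend only on a narrow storage interface. The sketch below is an illustration under that assumption and does not reflect Pinterest's actual design. FeatureStore, InMemoryFeatureStore, Ranker, and the linear scoring rule are all hypothetical.

```python
from typing import Protocol


class FeatureStore(Protocol):
    """Storage side: the only contract the compute side relies on."""

    def get_features(self, pin_id: str) -> dict[str, float]: ...


class InMemoryFeatureStore:
    """Hypothetical backend; a remote store could replace it unchanged."""

    def __init__(self, data: dict[str, dict[str, float]]):
        self._data = data

    def get_features(self, pin_id: str) -> dict[str, float]:
        return self._data.get(pin_id, {})


class Ranker:
    """Compute side: scores pins without knowing where features live."""

    def __init__(self, store: FeatureStore, weights: dict[str, float]):
        self.store = store
        self.weights = weights

    def score(self, pin_id: str) -> float:
        features = self.store.get_features(pin_id)
        # Simple weighted sum, used here only as a placeholder model.
        return sum(self.weights.get(name, 0.0) * value
                   for name, value in features.items())

    def rank(self, pin_ids: list[str]) -> list[str]:
        return sorted(pin_ids, key=self.score, reverse=True)


store = InMemoryFeatureStore({"a": {"clicks": 1.0}, "b": {"clicks": 3.0}})
ranker = Ranker(store, weights={"clicks": 2.0})
print(ranker.rank(["a", "b", "c"]))
```

With this split, the storage backend and the ranking tier can be scaled or replaced independently, because neither needs to change when the other does.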

Related Concepts

Real-time Data Processing
Machine Learning Optimization
Distributed Systems Architecture