Delivering Faster Analytics at Pinterest

Pinterest Engineering
6 min read · advanced

Overview

The article discusses Pinterest's migration from Druid to StarRocks for delivering faster analytics. It highlights the challenges faced, the requirements for the new system, and the significant performance improvements achieved post-migration.

What You'll Learn

1. How to migrate an analytics platform from Druid to StarRocks
2. Why real-time insights are crucial for advertising performance
3. How to optimize data ingestion processes using SQL in StarRocks
4. When to consider switching OLAP databases based on scalability needs

Prerequisites & Requirements

  • Understanding of OLAP databases and SQL
  • Familiarity with data analytics tools and frameworks (optional)

Key Questions Answered

What were the main challenges in offering Partner Insights at Pinterest?
The challenges included managing a massive number of advertisers with unique needs and metrics, requiring real-time aggregation of multi-dimensional data points. This complexity necessitated a database solution capable of handling high volumes of complex queries efficiently.

What improvements were observed after migrating to StarRocks?
Post-migration, Pinterest reported a 50% reduction in p90 latency and a three-fold increase in cost-performance efficiency. The data ingestion process was streamlined, achieving a data freshness of just 10 seconds.

What are the key features of StarRocks that benefited Pinterest?
StarRocks offers a standard SQL interface, supports complex queries like joins and sub-queries, and has native ingestion capabilities without external dependencies. These features significantly improved performance and usability for Pinterest's analytics needs.

How does Pinterest ensure real-time insights for advertisers?
Pinterest uses the Partner Insights tool, which provides customizable dashboards that allow advertisers to access real-time performance metrics. This enables marketers to make quick, data-driven adjustments to their advertising strategies.

Key Statistics & Figures

  • p90 latency reduction: 50%. Achieved after migrating to StarRocks, significantly improving performance.
  • Cost-performance efficiency increase: 3-fold. Realized post-migration with reduced instances required.
  • Data freshness: 10 seconds. Achieved through streamlined data ingestion processes.
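
For readers less familiar with percentile metrics, here is a minimal sketch of how a p90 latency and its relative reduction could be computed. The latency samples are invented for illustration and are not Pinterest's measurements, and the function names (`percentile_nearest_rank`, `relative_reduction`) are hypothetical. The nearest-rank method is one common percentile definition, and monitoring tools may use others.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def relative_reduction(before, after):
    """Fraction by which `after` is lower than `before`."""
    return (before - after) / before


# Invented query latencies in milliseconds (illustration only).
before_ms = [120, 150, 180, 200, 220, 250, 300, 400, 600, 900]
after_ms = [60, 70, 85, 100, 110, 130, 150, 200, 300, 450]

p90_before = percentile_nearest_rank(before_ms, 90)
p90_after = percentile_nearest_rank(after_ms, 90)
reduction = relative_reduction(p90_before, p90_after)

print(f"p90 before: {p90_before} ms, after: {p90_after} ms, reduction: {reduction:.0%}")
```

A p90 figure means that 90% of requests completed at or below that latency, so it describes the slow tail of requests rather than the typical one.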

Technologies & Tools

  • Database: StarRocks. Used for real-time OLAP analytics and data ingestion.
  • Service: Archmage. Provides a uniform interface over different analytical storage systems and manages the complexities of deployment.

Key Actionable Insights

1. Consider migrating to a more efficient OLAP database like StarRocks if your current setup struggles with scalability and performance.
   As Pinterest experienced, moving to StarRocks resulted in significant performance improvements and cost efficiency, making it a viable option for organizations facing similar challenges.
2. Use SQL for data ingestion to streamline onboarding and reduce complexity.
   By eliminating JSON configs and using SQL ingestion, Pinterest simplified its data processes, which can save time and resources during implementation.
3. Implement connection pooling in your applications to enhance performance and reduce latency.
   Pinterest's use of connection pooling in Archmage saved an average of 50 ms per JDBC connection, demonstrating the importance of optimizing database connections for better application responsiveness.
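
The connection pooling idea in the last insight can be sketched as follows. This is a generic illustration in Python, not Archmage's implementation (which the article describes as JDBC-based). `ConnectionPool` and `FakeConnection` are hypothetical names, and `FakeConnection` stands in for a real database driver so the example runs on its own.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Tiny fixed-size pool: connections are opened once and then reused."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable that returns a connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Wait for an idle connection instead of paying the setup cost again.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection to the pool, even if the query raised.
            self._idle.put(conn)


class FakeConnection:
    """Stand-in for a real database connection (hypothetical, illustration only)."""

    opened = 0  # counts how many connections were ever created

    def __init__(self):
        FakeConnection.opened += 1

    def execute(self, sql):
        return f"ran: {sql}"


pool = ConnectionPool(FakeConnection, size=2)
results = []
for i in range(5):
    with pool.connection() as conn:
        results.append(conn.execute(f"SELECT {i}"))

print(FakeConnection.opened, "connections served", len(results), "queries")
```

Five queries run over only two connections, so the per-connection setup cost is paid twice rather than five times. A production pool would also need health checks and handling for broken connections, which this sketch omits.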

Common Pitfalls

1. Failing to account for the complexity of multi-dimensional queries can lead to performance bottlenecks.
   This often happens when systems are not designed to handle high concurrency and complex aggregations, which can degrade user experience and slow down analytics.

Related Concepts

OLAP Databases
Real-time Analytics
Data Ingestion Techniques
Performance Optimization Strategies