Spinner: Pinterest’s Workflow Platform

Pinterest Engineering

Overview

The article discusses Spinner, Pinterest's workflow platform, detailing its evolution from an in-house scheduler called Pinball to Apache Airflow. It highlights the challenges faced with the previous system, the reasons for adopting Airflow, and the architectural changes made to optimize workflow management.

What You'll Learn

1. How to implement Apache Airflow for workflow management
2. Why migrating from an in-house scheduler to Apache Airflow enhances scalability
3. How to optimize performance through benchmarking and testing

Prerequisites & Requirements

  • Understanding of workflow management systems and Apache Airflow
  • Familiarity with Kubernetes and CI/CD processes (optional)

Key Questions Answered

What were the limitations of Pinterest's previous workflow system, Pinball?
The Pinball system faced several limitations including high job start delays, scalability issues due to stateful components, and lack of features like access control and documentation. These challenges prompted Pinterest to seek a more robust solution.
Why did Pinterest choose Apache Airflow over other workflow management tools?
Pinterest selected Apache Airflow due to its alignment with user needs, industry adoption, Python-based DSL, modular code structure, and scalability features. These factors made it a suitable choice for their growing workflow demands.
How does Pinterest's multi-partition scheduler improve workflow management?
The multi-partition scheduler allows multiple schedulers to manage different DAG partitions, preventing interference and optimizing performance. This setup supports higher priority workflows with tighter SLAs, enhancing overall efficiency.
What performance improvements were observed after migrating to Airflow?
After the migration, Airflow showed a scheduling delay of 50 seconds, compared with 180 seconds for Pinball. It could also handle 3,000 DAGs with 25 tasks each at a 10-minute delay.
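
To make the multi-partition idea concrete, here is a minimal sketch of how DAGs might be routed so that each one belongs to exactly one scheduler partition and high-priority DAGs never share a scheduler with the rest. It is not Pinterest's implementation: the tier names, the CRC32 bucketing, and the helper functions `assign_partition` and `build_partitions` are all assumptions made for illustration, and the sketch does not use any Airflow API.

```python
import zlib
from collections import defaultdict


def assign_partition(dag_id, tier, schedulers_per_tier):
    """Map a DAG to exactly one scheduler partition (hypothetical scheme).

    The tier keeps high-priority DAGs away from the rest; the CRC32 bucket
    spreads DAGs of one tier across that tier's schedulers and is stable
    across processes (unlike the built-in hash() for strings).
    """
    bucket = zlib.crc32(dag_id.encode("utf-8")) % schedulers_per_tier[tier]
    return f"{tier}-{bucket}"


def build_partitions(dag_tiers, schedulers_per_tier):
    """Group DAG ids by the partition each one is assigned to."""
    partitions = defaultdict(list)
    for dag_id, tier in sorted(dag_tiers.items()):
        name = assign_partition(dag_id, tier, schedulers_per_tier)
        partitions[name].append(dag_id)
    return dict(partitions)


# Hypothetical inputs: DAG ids, tiers, and scheduler counts are made up.
SCHEDULERS_PER_TIER = {"high": 2, "standard": 3}
DAG_TIERS = {
    "ads_hourly_report": "high",
    "search_index_refresh": "high",
    "weekly_cleanup": "standard",
    "backfill_metrics": "standard",
    "email_digest": "standard",
}

PARTITIONS = build_partitions(DAG_TIERS, SCHEDULERS_PER_TIER)
```

Because the assignment is a pure function of the DAG id and tier, every scheduler can compute its own share independently, and no DAG can land in two partitions at once.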

Key Statistics & Figures

Daily data ingestion
600 terabytes
This volume of data is critical for Pinterest's operations and informs the need for robust workflow management.
Total data stored
500 petabytes
This vast amount of data necessitates efficient processing and workflow management solutions.
Daily flow executions
10,000
This statistic underscores the scale at which Pinterest operates and the importance of an effective workflow system.
Daily job executions
38,000
This figure highlights the operational demands placed on Pinterest's workflow management system.
Scheduling delay comparison
50 seconds vs 180 seconds
This improvement illustrates the enhanced efficiency of Airflow over the previous Pinball system.
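
The summary does not say how scheduling delay was measured. As a rough sketch, assuming it means the gap between the time a job was due and the time it actually started, the comparison could be computed as below. The sample timestamps are synthetic and the function name `scheduling_delays` is hypothetical; only the 50-second and 180-second figures come from the text above.

```python
from datetime import datetime, timedelta
from statistics import median


def scheduling_delays(runs):
    """Return the delay in seconds for each (due_time, start_time) pair."""
    return [(started - due).total_seconds() for due, started in runs]


# Synthetic sample, not Pinterest measurements.
due = datetime(2024, 1, 1, 0, 0, 0)
SAMPLE_RUNS = [
    (due, due + timedelta(seconds=30)),
    (due, due + timedelta(seconds=45)),
    (due, due + timedelta(seconds=60)),
]

MEDIAN_DELAY = median(scheduling_delays(SAMPLE_RUNS))

# Ratio of the two delays quoted above: 180 s (Pinball) vs 50 s (Airflow).
DELAY_RATIO = 180 / 50
```

With the quoted figures, the ratio works out to 3.6, meaning Pinball's delay was 3.6 times Airflow's.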

Key Actionable Insights

1. Implementing a multi-partition scheduler can significantly enhance workflow efficiency.
   By distributing workload across multiple schedulers, organizations can reduce bottlenecks and improve response times, especially for high-priority tasks.
2. Regular benchmarking and performance testing are crucial for maintaining system efficiency.
   Conducting performance tests helps identify bottlenecks and informs necessary adjustments to the architecture, ensuring that the system can scale effectively.
3. Utilizing a serialized DAG representation can improve workflow performance.
   Caching the DAG representation reduces processing time for each call, which is particularly beneficial when managing thousands of workflows.
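
The caching idea in the last insight can be sketched in a few lines. This is an illustration under stated assumptions, not Airflow's own serialization code: `SerializedDagCache` and `toy_parse` are hypothetical names, and the one-edge-per-line DAG format exists only to keep the example self-contained. The cache stores a JSON form of each parsed DAG keyed by a hash of its source, so the parser runs again only when the source changes.

```python
import hashlib
import json


def toy_parse(source):
    """Hypothetical parser: each line 'a -> b' declares one dependency."""
    edges = {}
    for line in source.strip().splitlines():
        upstream, downstream = [part.strip() for part in line.split("->")]
        edges.setdefault(upstream, []).append(downstream)
        edges.setdefault(downstream, [])
    return {"tasks": sorted(edges), "edges": edges}


class SerializedDagCache:
    """Keep a JSON copy of each parsed DAG, keyed by a hash of its source."""

    def __init__(self, parse_fn):
        self._parse_fn = parse_fn
        self._cache = {}  # dag_id -> (source_hash, serialized_json)
        self.parse_calls = 0  # counts how often the parser actually ran

    def get(self, dag_id, source):
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self._cache.get(dag_id)
        if cached is None or cached[0] != digest:
            # Cache miss or stale entry: parse once and store the JSON form.
            self.parse_calls += 1
            serialized = json.dumps(self._parse_fn(source), sort_keys=True)
            self._cache[dag_id] = (digest, serialized)
        # Always hand back a fresh copy decoded from the serialized form.
        return json.loads(self._cache[dag_id][1])


cache = SerializedDagCache(toy_parse)
first = cache.get("etl", "extract -> transform\ntransform -> load")
second = cache.get("etl", "extract -> transform\ntransform -> load")
third = cache.get("etl", "extract -> load")
```

Here the second call is served from the cache, so the parser runs twice in total: once for the original source and once after the source changes.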

Common Pitfalls

1. Failing to properly manage DAG partitions can lead to performance bottlenecks.
   Without a well-structured partitioning strategy, schedulers may compete for resources, causing delays and inefficiencies.
2. Neglecting the importance of documentation can hinder user adoption.
   Inadequate documentation can lead to confusion among users, making it difficult for them to leverage the full capabilities of the workflow platform.

Related Concepts

Workflow Management Systems
Apache Airflow
Kubernetes Orchestration
Continuous Integration and Continuous Deployment