Serving configuration data at scale with high availability

Pinterest Engineering
7 min read · intermediate

Overview

This article describes how Pinterest engineers serve configuration data at scale with high availability. It covers the move from Redis to a design that uses Apache ZooKeeper for change notifications and Amazon S3 for storage, which allows quick updates and high read throughput without the network saturation seen with the Redis setup.

What You'll Learn

1. How to implement a high-availability configuration management system using ZooKeeper and S3
2. Why caching is critical for high read access in distributed systems
3. How to manage eventual consistency issues when using S3 for data storage

Prerequisites & Requirements

  • Understanding of distributed systems and caching concepts
  • Familiarity with Apache ZooKeeper and Amazon S3 (optional)

Key Questions Answered

What were the limitations of using Redis for configuration data?
Using Redis for configuration data led to network saturation as the number of servers grew. Whenever the data was updated, every server tried to download the latest copy from a single Redis master, causing connection errors and degraded performance.
How does the new solution using ZooKeeper and S3 improve data availability?
The new solution uses ZooKeeper for change notifications and S3 for storage. S3 provides the high availability and throughput needed to absorb download spikes, while ZooKeeper only has to announce that new data exists. Updates still propagate quickly, and the network saturation seen with the single Redis master is avoided.
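
The split between a notifier and a bulk store can be sketched as follows. This is a minimal illustration, not Pinterest's implementation: `FakeObjectStore`, `FakeNotifier`, and `ConfigClient` are hypothetical in-memory stand-ins for S3, a watched ZooKeeper node, and a per-machine client, and the key name `config/v1` is made up.

```python
from typing import Callable, Dict, List


class FakeObjectStore:
    """Hypothetical in-memory stand-in for S3: maps object keys to bytes."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


class FakeNotifier:
    """Hypothetical in-memory stand-in for a watched ZooKeeper node."""

    def __init__(self) -> None:
        self._watchers: List[Callable[[str], None]] = []

    def watch(self, callback: Callable[[str], None]) -> None:
        self._watchers.append(callback)

    def notify(self, key: str) -> None:
        # Only a small pointer (the object key) is pushed to watchers.
        for callback in self._watchers:
            callback(key)


class ConfigClient:
    """Keeps a local cached copy and refreshes it only when notified."""

    def __init__(self, store: FakeObjectStore, notifier: FakeNotifier) -> None:
        self._store = store
        self.cached: bytes = b""
        notifier.watch(self._on_change)

    def _on_change(self, key: str) -> None:
        # The bulk payload is fetched from the object store, not the notifier.
        self.cached = self._store.get(key)


store = FakeObjectStore()
notifier = FakeNotifier()
clients = [ConfigClient(store, notifier) for _ in range(3)]

# Writer: upload the payload first, then announce it.
store.put("config/v1", b"feature_x=10")
notifier.notify("config/v1")
```

Reads are served from each client's `cached` copy, so read traffic never reaches the notifier or the store; only an update triggers a download.
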
What is the role of the Decider framework in managing feature rollouts?
The Decider framework allows developers to gradually ramp up traffic to new features while monitoring performance. It enables quick adjustments to traffic distribution without requiring redeployments, ensuring a controlled and reliable rollout process.
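
A percentage-based ramp-up of this kind might look like the sketch below. The summary does not describe Decider's internals, so everything here is an assumption: the class shape, the method names, and the choice to hash the user ID into one of 100 buckets so that a given user gets a stable answer.

```python
import hashlib
from typing import Dict


class Decider:
    """Hypothetical sketch: maps a feature name to a rollout percentage."""

    def __init__(self, percentages: Dict[str, int]) -> None:
        self._percentages = dict(percentages)

    def update(self, feature: str, percent: int) -> None:
        # In a real system this value would arrive via a config update,
        # which is what makes ramping possible without a redeploy.
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._percentages[feature] = percent

    def is_enabled(self, feature: str, user_id: str) -> bool:
        percent = self._percentages.get(feature, 0)
        # Hash feature and user together so buckets differ per feature.
        digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < percent


decider = Decider({"new_feed": 0})
decider.update("new_feed", 10)   # ramp to roughly 10% of users
enabled = decider.is_enabled("new_feed", "user-42")
```

Because a user's bucket is fixed, raising the percentage only adds users; nobody who already has the feature loses it during a ramp-up.
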
How does the solution handle S3's eventual consistency model?
To manage S3's eventual consistency, the solution creates a new file for each write instead of updating the existing one. ZooKeeper is used to synchronize the new filename across all nodes, ensuring that updates are consistent and minimizing the risk of stale data.
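
The write-a-new-file approach can be sketched as below. Plain dictionaries stand in for S3 (`store`) and for the ZooKeeper node that holds the current filename (`pointer`); the counter-based version suffix and the function names are assumptions made for illustration.

```python
import itertools
from typing import Dict

# Hypothetical version source; a timestamp or UUID would serve the same role.
_version_counter = itertools.count(1)


def publish(store: Dict[str, bytes], pointer: Dict[str, str],
            name: str, data: bytes) -> str:
    """Write data under a brand-new key, then publish that key."""
    key = f"{name}.{next(_version_counter):06d}"
    store[key] = data      # existing objects are never overwritten
    pointer[name] = key    # readers learn the new key only after the write
    return key


def read(store: Dict[str, bytes], pointer: Dict[str, str], name: str) -> bytes:
    """Resolve the current key via the pointer, then fetch that object."""
    return store[pointer[name]]


store: Dict[str, bytes] = {}
pointer: Dict[str, str] = {}
first_key = publish(store, pointer, "blocklist", b"a.example")
second_key = publish(store, pointer, "blocklist", b"a.example,b.example")
latest = read(store, pointer, "blocklist")
```

Since every key is written exactly once, a reader that fetches a key gets either the complete new object or nothing, never an older version of the same name.
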

Key Statistics & Figures

Pin requests per second
hundreds of thousands
This high volume of requests necessitated a robust solution for managing configuration data.
Read access frequency
>100k/sec
The system was designed to handle this level of read access efficiently.

Technologies & Tools

Database
Redis
Initially used for storing configuration data but faced scalability issues.
Infrastructure
Apache ZooKeeper
Used as a notifier for configuration updates.
Storage
Amazon S3
Serves as the storage solution for configuration data, providing high availability.

Key Actionable Insights

1. Implement a push-based notification system using ZooKeeper to manage configuration updates efficiently.
   This approach reduces the need for clients to poll for updates, allowing for faster convergence of data across distributed systems.
2. Utilize Amazon S3 for storing configuration data to leverage its high availability and throughput capabilities.
   S3 can absorb sudden load spikes, making it an excellent choice for applications with high read access and infrequent updates.
3. Adopt a Compare And Swap strategy for updating configuration data to prevent dirty writes.
   This ensures that updates only occur if the data has not changed since it was last read, maintaining data integrity during concurrent updates.
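
The compare-and-swap update in the last insight can be sketched as below. ZooKeeper supports this natively because its set-data operation accepts an expected version. Here a hypothetical in-memory `VersionedNode` plays that role so the example runs on its own; the retry helper and its `max_attempts` parameter are also assumptions, and the sketch is single-threaded.

```python
from typing import Callable, Tuple


class BadVersionError(Exception):
    """Raised when the node changed since the caller last read it."""


class VersionedNode:
    """Hypothetical stand-in for a node with version-checked writes."""

    def __init__(self, data: bytes = b"") -> None:
        self._data = data
        self._version = 0

    def get(self) -> Tuple[bytes, int]:
        return self._data, self._version

    def set(self, data: bytes, expected_version: int) -> int:
        # Reject the write if someone else updated the node in between.
        if expected_version != self._version:
            raise BadVersionError(
                f"expected version {expected_version}, found {self._version}"
            )
        self._data = data
        self._version += 1
        return self._version


def update_with_retry(node: VersionedNode,
                      transform: Callable[[bytes], bytes],
                      max_attempts: int = 5) -> bytes:
    """Read, transform, and write back; re-read and retry on conflict."""
    for _ in range(max_attempts):
        current, version = node.get()
        new_value = transform(current)
        try:
            node.set(new_value, version)
            return new_value
        except BadVersionError:
            continue  # another writer won the race; start over from a fresh read
    raise RuntimeError("gave up after repeated concurrent modifications")


node = VersionedNode(b"a.example")
result = update_with_retry(node, lambda old: old + b",b.example")
```

A writer holding a stale version gets an error instead of silently overwriting a newer value, which is the dirty write this strategy is meant to prevent.
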

Common Pitfalls

1. Relying solely on a single Redis master for configuration data can lead to network saturation and connection errors.
   As the number of servers increases, they all attempt to fetch updates simultaneously, overwhelming the Redis master and causing performance issues.

Related Concepts

Distributed Systems
Caching Strategies
Eventual Consistency
Feature Toggling