Overview
This article details Pinterest's experience upgrading their Batch Processing Platform, Monarch, from Hadoop 2.7.1 to Hadoop 2.10.0. It outlines the challenges faced, the upgrade strategy implemented, and the solutions developed to ensure a smooth transition without impacting performance or service level agreements.
What You'll Learn
1
How to upgrade a large-scale Hadoop cluster without downtime
2
Why incremental upgrades are essential for minimizing service disruption
3
How to address compatibility issues between different Hadoop versions
Prerequisites & Requirements
- Understanding of Hadoop architecture and versioning
- Experience with large-scale data processing systems
Key Questions Answered
What were the main challenges faced during the Hadoop upgrade at Pinterest?
The main challenges included managing hundreds of internal patches specific to Pinterest, ensuring no downtime for critical workloads, and maintaining performance standards during the upgrade process. The complexity of backporting features and bug fixes from the older version also posed significant hurdles.
How did Pinterest ensure compatibility between Hadoop 2.7 and 2.10 during the upgrade?
Pinterest ensured compatibility by implementing an incremental upgrade strategy, allowing user jobs to continue using Hadoop 2.7 while upgrading the platform to Hadoop 2.10. They also conducted extensive validations to ensure that applications built on Hadoop 2.7 would function correctly in the new environment.
What specific solutions were developed to address dependency issues during the upgrade?
To address dependency issues, Pinterest modified user jobs to be compatible with Hadoop platform dependencies or shaded the versions within their job artifacts. They also ensured that Hadoop 2.7 jars were not included in user jars to prevent conflicts during runtime.
Key Statistics & Figures
Number of nodes in the Hadoop clusters
17k+ nodes
This reflects the scale of Pinterest's Batch Processing Platform, Monarch, which operates on AWS EC2.
Technologies & Tools
Backend
Hadoop
Used as the core framework for batch processing in Pinterest's Monarch platform.
Cloud Infrastructure
AWS EC2
Provides the underlying infrastructure for running the Hadoop clusters.
Storage
S3
Used for persistent storage of input and output data in the batch processing tasks.
Key Actionable Insights
1Implement an incremental upgrade strategy when transitioning between major versions of software to minimize disruption.This approach allows for a smoother transition and helps identify potential issues in a controlled manner, reducing the risk of service interruptions.
2Conduct thorough validations of user-defined applications before and after upgrades to ensure compatibility.Validating applications helps catch issues early, reducing the time and resources needed to address problems that arise post-upgrade.
3Decouple user applications from platform dependencies to streamline upgrades and reduce conflicts.By ensuring that user applications do not carry legacy dependencies, organizations can avoid many common pitfalls associated with version upgrades.
Common Pitfalls
1
Failing to validate user applications before upgrading can lead to significant downtime and service disruptions.
Without proper validation, organizations risk encountering compatibility issues that can halt processing and affect service level agreements.
2
Not decoupling user applications from platform dependencies can result in conflicts during runtime.
When applications carry legacy dependencies, it complicates the upgrade process and can lead to unexpected failures.
Related Concepts
Hadoop Architecture
Batch Processing Systems
Incremental Upgrade Strategies
Dependency Management In Software Upgrades