Vijayant Soni, Sashidhar Thallam, Sakshi Pande, Atul Mantri • 10 min read • intermediate
Overview
The article discusses Spark Analysers, a system developed by Uber to identify anti-patterns in Spark applications. It highlights the challenges of optimizing Spark apps at scale and presents the architecture and components of the Spark Analysers system.
What You'll Learn
1. How to detect anti-patterns in Spark applications using Spark Analysers
2. Why optimizing Spark applications is crucial for resource efficiency at scale
3. How to implement a Flink application to analyze Spark events in real-time
Prerequisites & Requirements
- Understanding of Apache Spark and its architecture
- Familiarity with Kafka and Flink (optional)
Key Questions Answered
What are Spark Analysers and how do they work?
Spark Analysers are components designed to detect anti-patterns in Spark applications. They work alongside a Spark Event Listener that captures events during application execution; the Analysers themselves run as a Flink application that analyzes these events in real time to identify inefficiencies.
What are the main components of the Spark Analysers system?
The Spark Analysers system comprises two main components: the Spark Event Listener, which listens for events emitted by Spark applications, and the Analysers, which are implemented as a Flink application that processes these events to identify anti-patterns.
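The division of labour between the two components can be sketched in plain Python. This is only an illustration under stated assumptions: `InMemoryTopic` stands in for a Kafka topic, `EventListener` for the Spark Event Listener, and `run_analysers` for the Flink application. All of these names, and the event fields used with them, are hypothetical and are not taken from Uber's implementation.

```python
import json
from collections import deque
from typing import Callable, Deque, Dict, List, Optional

Event = Dict[str, object]
# An analyser inspects one event and returns a finding, or None if the
# event shows no anti-pattern. (Hypothetical interface.)
Analyser = Callable[[Event], Optional[Event]]


class InMemoryTopic:
    """Stand-in for a Kafka topic: a FIFO queue of JSON-encoded messages."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()

    def publish(self, event: Event) -> None:
        self._messages.append(json.dumps(event))

    def drain(self) -> List[Event]:
        """Return all pending messages in order and empty the queue."""
        events = [json.loads(message) for message in self._messages]
        self._messages.clear()
        return events


class EventListener:
    """Plays the role of the Spark Event Listener: forwards events to a topic."""

    def __init__(self, topic: InMemoryTopic) -> None:
        self._topic = topic

    def on_event(self, event: Event) -> None:
        self._topic.publish(event)


def run_analysers(source: InMemoryTopic, sink: InMemoryTopic,
                  analysers: List[Analyser]) -> None:
    """Plays the role of the Flink job: apply every analyser to every event
    and publish each finding to the output topic."""
    for event in source.drain():
        for analyse in analysers:
            finding = analyse(event)
            if finding is not None:
                sink.publish(finding)
```

Keeping the listener free of analysis logic means new analysers can be added on the consuming side without touching the Spark applications that emit the events.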
How does the Excessive Partition Scan Analyser function?
The Excessive Partition Scan Analyser checks events in the Kafka topic for data scan information. It evaluates the type of tables being scanned and applies threshold validations to determine if an anti-pattern event should be created and pushed to a different Kafka topic.
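The threshold validation described above might look roughly like the following. It is a minimal sketch, not the actual analyser: the table types, the threshold values, and the `ScanEvent` and `AntiPatternEvent` shapes are all assumptions made for illustration, and the Kafka input and output topics are replaced by ordinary Python iterables.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Hypothetical per-table-type limits on partitions scanned; the article
# does not give the real table types or threshold values.
PARTITION_THRESHOLDS = {"fact": 100, "dimension": 10}
DEFAULT_THRESHOLD = 50


@dataclass(frozen=True)
class ScanEvent:
    """A data-scan record as it might be read from the input topic."""
    app_id: str
    table: str
    table_type: str
    partitions_scanned: int


@dataclass(frozen=True)
class AntiPatternEvent:
    """A finding destined for the output topic."""
    app_id: str
    table: str
    partitions_scanned: int
    threshold: int


def analyse_scans(events: Iterable[ScanEvent]) -> Iterator[AntiPatternEvent]:
    """Yield an anti-pattern event for each scan that exceeds the
    threshold configured for its table type."""
    for event in events:
        threshold = PARTITION_THRESHOLDS.get(event.table_type, DEFAULT_THRESHOLD)
        if event.partitions_scanned > threshold:
            yield AntiPatternEvent(
                event.app_id, event.table, event.partitions_scanned, threshold
            )


if __name__ == "__main__":
    sample = [
        ScanEvent("app-1", "trips", "fact", 365),
        ScanEvent("app-2", "cities", "dimension", 3),
    ]
    for finding in analyse_scans(sample):
        print(finding)
```

Carrying the threshold in the emitted event lets whoever reads the finding see how far over the limit the scan was without looking up the configuration.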
What is YARNRed and how does it relate to Spark Analysers?
YARNRed is an initiative at Uber aimed at reducing YARN resource consumption. It utilizes insights from Spark Analysers to create Jira tickets for applications that exhibit inefficiencies, thereby promoting resource optimization.
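The step from detections to ticket candidates can be pictured as a group-and-filter pass. The sketch below is an assumption about how such a cost filter could work, not a description of YARNRed itself: the `(app_id, anti_pattern, estimated_cost)` tuple layout, the `MIN_COST_UCORES` cut-off, and the returned dictionary keys are hypothetical, and no Jira call is made.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical cut-off: the article mentions cost filtering but not the value.
MIN_COST_UCORES = 10.0


def ticket_candidates(
    detections: Iterable[Tuple[str, str, float]],
    min_cost: float = MIN_COST_UCORES,
) -> List[Dict[str, object]]:
    """Group (app_id, anti_pattern, estimated_cost) detections by application
    and keep the applications whose total estimated cost reaches min_cost."""
    cost_by_app: Dict[str, float] = defaultdict(float)
    patterns_by_app: Dict[str, Set[str]] = defaultdict(set)
    for app_id, anti_pattern, cost in detections:
        cost_by_app[app_id] += cost
        patterns_by_app[app_id].add(anti_pattern)

    candidates = [
        {
            "app_id": app_id,
            "anti_patterns": sorted(patterns_by_app[app_id]),
            "estimated_cost": total,
        }
        for app_id, total in cost_by_app.items()
        if total >= min_cost
    ]
    # Most expensive applications first, so tickets go where savings are largest.
    return sorted(candidates, key=lambda c: c["estimated_cost"], reverse=True)
```

Filtering on total cost per application, rather than per detection, keeps one ticket per application owner even when several anti-patterns fire for the same job.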
Key Statistics & Figures
Daily anti-pattern detections
over 5000
This figure represents the number of anti-patterns detected across more than 1500 distinct applications daily.
Annual savings from current detections
60k+ uCores
This statistic highlights the efficiency gains achieved through the implementation of Spark Analysers.
Candidates for ticket creation after cost filtering
~150
These candidates represent applications that could benefit from optimizations based on the analysis performed by Spark Analysers.
Technologies & Tools
Backend
Apache Spark
Used as the primary distributed computing engine for batch processing at Uber.
Backend
Apache Flink
Employed for real-time data processing and analysis of Spark events.
Messaging
Kafka
Utilized for event streaming between Spark applications and the Flink analysers.
Key Actionable Insights
1. Implement Spark Analysers in your Spark applications to identify and rectify anti-patterns. By integrating Spark Analysers, you can enhance application performance and reduce resource consumption, which is crucial for large-scale operations.
2. Utilize the Kafka messaging system to monitor Spark events effectively. Leveraging Kafka allows for real-time processing and analysis of Spark application events, enabling quicker identification of inefficiencies.
3. Regularly review and act on the Jira tickets generated by YARNRed. This proactive approach helps in optimizing Spark applications and ensuring efficient resource usage, ultimately leading to cost savings.
Common Pitfalls
1. Failing to optimize Spark applications can lead to inefficient resource usage.
This often occurs when users are not aware of the intricacies of Spark, resulting in unoptimized applications that consume excessive resources.
2. Ignoring the insights provided by Spark Analysers and YARNRed can lead to missed optimization opportunities.
Without acting on the recommendations from these systems, application owners may continue to incur unnecessary costs and inefficiencies.
Related Concepts
Apache Spark Optimization Techniques
Real-time Data Processing With Flink
Resource Management In Distributed Systems