Overview
The article discusses Uber's implementation of Presto Express, an enhancement to the Presto SQL query engine aimed at reducing the end-to-end Service Level Agreement (SLA) for short-running queries. It details the challenges faced with query slowness, the design and implementation of express queries, and the significant performance improvements achieved.
What You'll Learn
1
How to identify express queries using historical data
2
Why separating express clusters can improve query performance
3
How to implement a new queue system for express queries
Prerequisites & Requirements
- Understanding of SQL query execution and performance metrics
- Familiarity with distributed systems and data analytics(optional)
Key Questions Answered
What is Presto Express and how does it improve query performance?
Presto Express is an enhancement to the Presto SQL query engine designed to reduce the end-to-end SLA for short-running queries. By implementing a separate queue for express queries and optimizing resource allocation, it significantly improves performance, allowing more queries to be processed quickly.
How does Uber identify express queries?
Uber identifies express queries by analyzing historical execution times using exact and abstract fingerprints. If the P90 or P95 execution time of a query in the last few days is under 2 minutes, it is classified as an express query, allowing for optimized processing.
What challenges did Uber face with Presto before implementing express queries?
Uber faced significant query slowness due to throttling and concurrency limits in Presto clusters, which led to the need for additional capacity. This throttling resulted in long queuing times for users, particularly for non-express queries.
What impact did the express feature have on query performance metrics?
The express feature allowed low-tier express clusters to use only 10% of total resources while handling about 75% of the batch low-tier queries. It also reduced the P90 queuing time for express queries to about 10 seconds compared to hours for non-express queries.
Key Statistics & Figures
Weekly active users
12,000
These users run approximately 500,000 queries daily, reading around 100 PB from HDFS.
P90 queuing time for express queries
10 seconds
This is significantly lower compared to hours for non-express queries.
Percentage of batch low-tier queries handled by express clusters
75%
Express clusters utilize only 10% of the total batch low-tier Presto resources.
Technologies & Tools
Some links below are affiliate links. We may earn a commission if you make a purchase.
Backend
Presto
Used as the SQL query engine for interactive analytic queries.
Data Source
Apache Hive
One of the data sources queried through Presto.
Data Source
Apache Pinot
Used for real-time querying of data.
Data Source
Mysql
Another data source queried through Presto.
Data Source
Apache Kafka
Used for streaming data into Presto.
Key Actionable Insights
1Implement a separate queue for express queries to enhance performance.By routing express queries to a dedicated queue, organizations can significantly reduce queuing times and improve overall SLA for users, as seen in Uber's implementation.
2Utilize historical data to predict query performance and classify express queries.Leveraging historical execution times can help in accurately identifying express queries, thus optimizing resource allocation and reducing latency.
3Consider separating express clusters from existing batch clusters.This separation can lead to better resource utilization and performance, as express clusters can handle more queries without being constrained by the limits of batch processing.
Common Pitfalls
1
Overlooking query throttling in the execution process.
This can lead to express queries being idle despite being designed for quick execution, resulting in increased load on other clusters.
2
Failing to match resource allocation with query requirements.
Express queries require adequate resources to ensure they do not get blocked by non-express queries, which can lead to performance bottlenecks.
Related Concepts
Distributed SQL Query Engines
Performance Optimization Techniques
Data Analytics Best Practices