Real-Time Analytics for Mobile App Crashes using Apache Pinot

Kriti Dangi, Anil Purohit, Parijat Bansal, Rohit Yadav
17 min read, intermediate

Overview

This article describes how Uber uses Apache Pinot for real-time analytics of mobile app crashes so that issues can be detected and resolved quickly. It covers the architecture, the implementation strategies, and the performance improvements the system achieved.

What You'll Learn

1. How to implement real-time crash analytics using Apache Pinot
2. Why data retention policies are crucial for analytics performance
3. When to use hybrid table setups for data storage
4. How to optimize query patterns for better performance

Prerequisites & Requirements

  • Understanding of real-time data processing concepts
  • Familiarity with Apache Pinot and Kafka (optional)

Key Questions Answered

How does Uber use Apache Pinot for crash analytics?
Uber employs Apache Pinot to process and analyze crash data from mobile applications in real time. This allows them to quickly identify and resolve issues, enhancing user experience and maintaining trust. The system classifies crashes, aggregates data, and provides insights to developers and release managers.
What are the data retention policies for crash analytics at Uber?
Uber retains crash data for 45 days to analyze historical trends, with most use cases accessing data from the last 30 days. This retention policy helps in understanding patterns and improving the overall reliability of the application.
What are the performance improvements after migrating from Elasticsearch to Pinot?
Migrating from Elasticsearch to Pinot significantly improved query performance, especially for queries spanning extended time periods. Pinot's performance degraded less than Elasticsearch's over those longer periods, making it the more efficient choice for real-time analytics.
What challenges does Uber face with Pinot for crash analytics?
Uber encounters challenges such as the inability to perform complex aggregations on multiple dimensions and the fixed number of segments for offline jobs, which can lead to reliability issues. They have implemented workarounds, such as firing multiple queries in parallel to mitigate these limitations.
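
The 45-day retention and the hybrid (offline plus real-time) layout mentioned in this summary can be sketched as a pair of Pinot table configs. This is a minimal fragment, assuming retention is enforced through Pinot's table-level `segmentsConfig` settings; the table name `crash_events` and the time column `crash_time_ms` are hypothetical, and a deployable config would need further sections (index, tenant, and stream settings) that are omitted here.

```python
import json

RETENTION_DAYS = 45  # retention figure stated in the summary above


def build_table_config(table_name: str, table_type: str, time_column: str) -> dict:
    """Return a minimal Pinot-style table config fragment with day-based retention.

    table_type is "OFFLINE" or "REALTIME". A hybrid table is the pair of
    both types registered under the same table name.
    """
    if table_type not in ("OFFLINE", "REALTIME"):
        raise ValueError(f"unsupported table type: {table_type}")
    return {
        "tableName": table_name,
        "tableType": table_type,
        "segmentsConfig": {
            "timeColumnName": time_column,
            "retentionTimeUnit": "DAYS",
            "retentionTimeValue": str(RETENTION_DAYS),
        },
    }


# Hypothetical table and column names, used only for illustration.
hybrid_configs = [
    build_table_config("crash_events", table_type, "crash_time_ms")
    for table_type in ("OFFLINE", "REALTIME")
]

# Print the two fragments as JSON, the format Pinot table configs use.
print(json.dumps(hybrid_configs, indent=2))
```

Because both fragments share one table name, queries address a single logical table while Pinot serves older data from the offline segments and recent data from the real-time segments.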

Key Statistics & Figures

Changes rolled out weekly at Uber: 11,000
This high frequency of changes necessitates a robust system for real-time crash analytics.
Average daily data size for crash logs: 36 TB
This significant volume of data underscores the importance of efficient data processing and retention strategies.
Peak crash classification rate: 1,500 crashes per second
This rate highlights the need for a scalable analytics solution to handle real-time data influx.

Key Actionable Insights

1. Implement a hybrid table setup to balance real-time and offline data processing needs.
   This approach allows for efficient data ingestion and querying, ensuring that even if offline jobs fail, real-time data remains accessible.
2. Utilize data compression techniques to manage large payloads effectively.
   By compressing crash event data, Uber can reduce storage costs and improve query performance, which is crucial for maintaining system efficiency.
3. Regularly review and adjust data retention policies to optimize performance.
   Maintaining a 45-day retention policy helps in analyzing trends without overwhelming the system, ensuring that only relevant data is kept for analysis.
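
The payload compression idea can be illustrated with a small round-trip sketch. The summary does not say which codec or event layout Uber uses, so this example assumes zlib and a hypothetical JSON event with `app`, `os`, and `stack_trace` fields, purely to show how a large, repetitive crash payload might shrink before storage.

```python
import json
import zlib


def compress_event(event: dict) -> bytes:
    """Serialize a crash event to compact JSON and compress it with zlib."""
    raw = json.dumps(event, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, level=6)


def decompress_event(blob: bytes) -> dict:
    """Reverse compress_event: decompress the bytes and parse the JSON."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical crash event; stack traces tend to be long and repetitive,
# which is the kind of payload that compresses well.
sample_event = {
    "app": "rider",
    "os": "android",
    "stack_trace": "\n".join(
        f"at com.example.Frame{i % 5}.call(Frame.java:{i % 5})" for i in range(200)
    ),
}

raw_size = len(json.dumps(sample_event, separators=(",", ":")).encode("utf-8"))
blob = compress_event(sample_event)
print(f"raw bytes: {raw_size}, compressed bytes: {len(blob)}")
```

Compressing at the event level keeps the stored column small, at the cost of a decompression step whenever the full payload is read back.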

Common Pitfalls

1. Relying solely on a single query for complex data retrieval can lead to performance bottlenecks.
   To avoid this, consider firing multiple parallel queries to distribute the load and enhance response times.
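
The parallel-query workaround can be sketched as one single-dimension aggregation per query, fanned out on a thread pool. This is an illustration under stated assumptions, not Uber's implementation: the table name, column names, and the `run_query` callable are hypothetical, and `run_query` stands in for whatever Pinot client the application actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def build_queries(table: str, dimensions: List[str], start_ms: int) -> Dict[str, str]:
    """Build one single-dimension aggregation query per dimension.

    crash_time_ms is a hypothetical epoch-millisecond time column.
    """
    return {
        dim: (
            f"SELECT {dim}, COUNT(*) AS crashes FROM {table} "
            f"WHERE crash_time_ms >= {start_ms} "
            f"GROUP BY {dim} ORDER BY crashes DESC LIMIT 10"
        )
        for dim in dimensions
    }


def run_parallel(
    queries: Dict[str, str],
    run_query: Callable[[str], List[tuple]],
    max_workers: int = 4,
) -> Dict[str, List[tuple]]:
    """Submit every query to a thread pool and collect results by dimension."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run_query, sql) for name, sql in queries.items()}
        # result() blocks until that query finishes and re-raises its exception, if any.
        return {name: future.result() for name, future in futures.items()}


def fake_run_query(sql: str) -> List[tuple]:
    """Stand-in for a real Pinot client call; returns a placeholder row."""
    return [("placeholder", len(sql))]


queries = build_queries("crash_events", ["os_version", "app_version", "device_model"], 0)
results = run_parallel(queries, fake_run_query)
for dimension, rows in results.items():
    print(dimension, rows)
```

Threads suit this pattern because each query spends most of its time waiting on the network, so the total latency approaches that of the slowest query rather than the sum of all of them.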

Related Concepts

Real-time Data Processing
Crash Analytics
Data Retention Policies
Performance Optimization Techniques