How LedgerStore Supports Trillions of Indexes at Uber

Kaushik Devarajaiah
13 min read, advanced

Overview

The article discusses how Uber's LedgerStore manages trillions of indexes to support its vast transactional data, emphasizing the architecture and indexing strategies that ensure data integrity and performance. It highlights the transition from DynamoDB to Docstore, showcasing the benefits of this migration in terms of cost and efficiency.

What You'll Learn

1. How to implement strongly consistent indexes for transactional systems
2. Why eventually consistent indexes suit non-critical data
3. How to manage the index lifecycle effectively in a large-scale system

Prerequisites & Requirements

  • Understanding of database indexing concepts
  • Familiarity with distributed databases like DynamoDB and Docstore (optional)

Key Questions Answered

How does LedgerStore support trillions of indexes at Uber?
LedgerStore utilizes a robust architecture that allows for the creation and management of trillions of indexes, ensuring data integrity and performance across billions of transactions. This is achieved through various types of indexes, including strongly consistent and eventually consistent indexes, tailored to specific use cases.
What are the benefits of migrating from DynamoDB to Docstore?
Migrating from DynamoDB to Docstore has resulted in significant cost savings, estimated at over $6 million annually, and improved operational efficiency by consolidating technology and reducing external dependencies. This transition also enhanced the indexing capabilities of LedgerStore.
What challenges are associated with maintaining petabyte-scale indexes?
Maintaining petabyte-scale indexes presents challenges such as imbalanced partitioning, high read/write amplification, and noisy neighbor problems. These issues necessitate careful data modeling and isolation strategies to ensure system performance and reliability.
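
The difference between the strongly consistent and eventually consistent indexes mentioned above can be illustrated with a small in-memory sketch. This is not LedgerStore's implementation: the `ToyLedger` class, the field names `authorization_id` and `rider_id`, and the single-process queue are all assumptions made for illustration. A real distributed store would need a multi-step protocol because the record and its index entry usually live on different partitions.

```python
from collections import deque


class ToyLedger:
    """Hypothetical in-memory stand-in for a record table with two indexes."""

    def __init__(self):
        self.records = {}         # record_id -> record
        self.strong_index = {}    # unique key -> record_id, updated with the write
        self.eventual_index = {}  # key -> list of record_ids, updated later
        self._pending = deque()   # index updates not yet applied

    def write(self, record_id, record):
        # Strongly consistent path: the index is checked and updated before
        # the write is acknowledged, so a duplicate is rejected immediately.
        key = record["authorization_id"]
        if key in self.strong_index:
            raise ValueError(f"duplicate authorization {key}")
        self.strong_index[key] = record_id
        self.records[record_id] = record
        # Eventually consistent path: the update is only queued here.
        self._pending.append((record["rider_id"], record_id))

    def apply_pending(self):
        """Simulate the background indexer catching up."""
        while self._pending:
            key, record_id = self._pending.popleft()
            self.eventual_index.setdefault(key, []).append(record_id)


ledger = ToyLedger()
ledger.write("r1", {"authorization_id": "auth-1", "rider_id": "rider-9"})
visible_before = ledger.eventual_index.get("rider-9", [])  # not indexed yet
ledger.apply_pending()
visible_after = ledger.eventual_index.get("rider-9", [])   # now indexed
```

The strong index answers lookups as soon as `write` returns, at the cost of doing more work on the write path. The eventual index keeps writes cheap, but readers may briefly miss recent records.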

Key Statistics & Figures

Unique indexes built
over 2 trillion
This figure illustrates the scale at which LedgerStore operates, ensuring data consistency across Uber's transactions.
Estimated yearly savings
over $6 million
These savings are a direct result of migrating from DynamoDB to Docstore, highlighting the cost-effectiveness of the new architecture.

Technologies & Tools

Database
Docstore
Used for managing indexes efficiently at scale.
Database
DynamoDB
Previously used for index management before migrating to Docstore.

Key Actionable Insights

1. Implement strongly consistent indexes for critical financial transactions to avoid issues like duplicate charges.
   This is particularly important in scenarios where immediate visibility of changes is crucial, such as during credit card authorization processes.
2. Leverage eventually consistent indexes for non-critical data to enhance system performance and reduce latency.
   This approach is suitable for use cases like payment history where slight delays in data visibility are acceptable.
3. Establish a robust index lifecycle management system to handle the creation, validation, and decommissioning of indexes.
   This ensures that indexes remain relevant and efficient as business requirements evolve, minimizing operational overhead.
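
One way to picture the lifecycle management described in the last insight is as a small state machine that only permits legal transitions. The sketch below is a hedged illustration: the state names, the `ALLOWED` transition table, and the `IndexLifecycle` class are hypothetical and chosen to mirror the creation, validation, and decommissioning steps named above, not taken from LedgerStore.

```python
from enum import Enum


class IndexState(Enum):
    CREATED = "created"
    VALIDATING = "validating"
    ACTIVE = "active"
    DECOMMISSIONED = "decommissioned"


# Hypothetical transition table: a failed validation sends the index back
# to CREATED so it can be rebuilt; a decommissioned index is terminal.
ALLOWED = {
    IndexState.CREATED: {IndexState.VALIDATING},
    IndexState.VALIDATING: {IndexState.ACTIVE, IndexState.CREATED},
    IndexState.ACTIVE: {IndexState.DECOMMISSIONED},
    IndexState.DECOMMISSIONED: set(),
}


class IndexLifecycle:
    def __init__(self, name):
        self.name = name
        self.state = IndexState.CREATED
        self.history = [self.state]

    def advance(self, new_state):
        # Refuse any transition that is not listed in the table.
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.name}: cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)

    def serves_reads(self):
        # Only a validated, active index should answer queries.
        return self.state is IndexState.ACTIVE


index = IndexLifecycle("payments_by_rider")
index.advance(IndexState.VALIDATING)
index.advance(IndexState.ACTIVE)
ready = index.serves_reads()
```

Keeping the transitions in one table makes it easy to audit which operations are allowed at each stage, for example that an index can never serve reads before it has passed validation.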

Common Pitfalls

1. Failing to manage partitioning effectively can lead to hot partitions and throttling issues.
   This often occurs when data is clustered around specific timestamps, causing uneven load distribution. To avoid this, implement strategies that ensure uniform data distribution across partitions.
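
A common general technique for spreading timestamp-clustered data is to prefix the partition key with a hash-derived bucket. The sketch below shows that idea under stated assumptions: `NUM_BUCKETS`, `bucket_for`, and `partition_key` are hypothetical names, and nothing here is claimed to be the scheme LedgerStore uses.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 16  # assumed bucket count, chosen only for illustration


def bucket_for(record_id, num_buckets=NUM_BUCKETS):
    # A stable hash keeps the same record in the same bucket on every run.
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def partition_key(timestamp_minute, record_id):
    # Prefixing with the bucket spreads one busy minute over many partitions.
    return f"{bucket_for(record_id):02d}#{timestamp_minute}"


# 10,000 records that all fall in the same minute.
records = [("2024-05-01T12:00", f"txn-{i}") for i in range(10_000)]

naive = Counter(ts for ts, _ in records)  # timestamp alone as the key
salted = Counter(partition_key(ts, rid) for ts, rid in records)
```

With the timestamp alone, every record lands on a single key, while the bucketed key spreads the same records across `NUM_BUCKETS` keys. The trade-off is that a query for one timestamp must now fan out to every bucket and merge the results.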

Related Concepts

Database Indexing Strategies
Distributed Database Management
Data Integrity in Financial Systems