Migrating a Trillion Entries of Uber’s Ledger Data from DynamoDB to LedgerStore

Raghav Gautam, Erik Seaberg, Abhishek Kanhar
12 min read · advanced

Overview

This article details Uber's migration of over a trillion entries of ledger data from DynamoDB to LedgerStore, focusing on the challenges, strategies, and outcomes of the process. It emphasizes the importance of immutability, cost savings, and the need for a seamless transition without service disruption.

What You'll Learn

  1. How to migrate large datasets without downtime
  2. Why immutability is crucial for ledger-style databases
  3. How to implement effective data validation strategies during migration

Prerequisites & Requirements

  • Understanding of database migration principles
  • Familiarity with Apache Spark for data processing (optional)

Key Questions Answered

What were the main reasons for migrating from DynamoDB to LedgerStore?
The migration was driven by three goals: lower cost, a store better suited to immutable ledger data, and a simpler storage architecture that consolidates data management into a single system. The change was also intended to improve performance and reduce operational complexity.
How did Uber ensure data integrity during the migration process?
Uber employed shadow validation and offline validation techniques to ensure data integrity. Shadow validation compared responses from the old and new systems during migration, while offline validation involved comparing complete datasets to identify and backfill any missing records.
What challenges did Uber face during the backfill process?
Challenges included managing scalability, ensuring fault tolerance, and handling data quality issues. The team had to implement incremental backfills to avoid overwhelming the system and to ensure that data was accurately written without causing service disruptions.
What strategies were used for rate control during the backfill?
Uber implemented rate control mechanisms to manage the backfill job's speed, so the rate could be adjusted to the current system load. This included using Guava's RateLimiter to keep throughput steady and avoid overwhelming the system during high-traffic periods.
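
The rate control and incremental backfill described above can be sketched roughly as follows. The source names Guava's RateLimiter, which is a Java library; this sketch swaps in a small evenly paced limiter written in plain Python, so it illustrates the idea rather than the team's implementation. The names `PacedLimiter`, `backfill`, `write_record`, and `target_rate` are hypothetical.

```python
import time


class PacedLimiter:
    """Evenly paced limiter; a plain-Python stand-in for Guava's RateLimiter."""

    def __init__(self, permits_per_sec, clock=time.monotonic, sleep=time.sleep):
        self._clock = clock
        self._sleep = sleep
        self._next_free = clock()
        self.set_rate(permits_per_sec)

    def set_rate(self, permits_per_sec):
        """Change the rate at runtime, e.g. lower it when the store is under load."""
        if permits_per_sec <= 0:
            raise ValueError("rate must be positive")
        self._interval = 1.0 / permits_per_sec

    def acquire(self):
        """Wait until the next permit is due; return the seconds waited."""
        now = self._clock()
        wait = max(0.0, self._next_free - now)
        if wait > 0:
            self._sleep(wait)
        # The following permit becomes available one interval after this one.
        self._next_free = max(now, self._next_free) + self._interval
        return wait


def backfill(records, write_record, limiter, batch_size=100, target_rate=None):
    """Write records in small batches, pacing every write through the limiter.

    `target_rate` is an optional callable (hypothetical hook) that returns the
    currently allowed writes per second; it is consulted before each batch.
    """
    written = 0
    for start in range(0, len(records), batch_size):
        if target_rate is not None:
            limiter.set_rate(target_rate())
        for record in records[start:start + batch_size]:
            limiter.acquire()
            write_record(record)
            written += 1
    return written


if __name__ == "__main__":
    sink = []
    limiter = PacedLimiter(permits_per_sec=50)
    backfill(list(range(5)), sink.append, limiter, batch_size=2)
    print(len(sink), "records written")
```

Consulting `target_rate` once per batch keeps the adjustment cheap while still letting an operator or a load signal slow the job down between batches.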

Key Statistics & Figures

Total entries migrated: over a trillion
This figure represents the scale of the data migration effort undertaken by Uber.

Compressed size of immutable records: 1.2 PB
This statistic highlights the volume of data being handled during the migration.

Uncompressed size of secondary indexes: 0.5 PB
This indicates the additional storage requirements for indexing the migrated data.

Key Actionable Insights

  1. Implement shadow validation during data migrations to ensure accuracy and completeness.
     Shadow validation allows for real-time comparison between old and new data sources, helping to identify discrepancies early in the migration process.
  2. Utilize offline validation to address issues with rarely accessed historical data.
     This method ensures that all records, especially those not frequently accessed, are validated and backfilled correctly, preventing potential data integrity issues.
  3. Adopt a phased rollout strategy for new systems to mitigate risk.
     Gradually introducing the new system allows for monitoring and adjustments based on real-time feedback, reducing the likelihood of major disruptions.
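
The shadow validation in the list above can be pictured with a minimal sketch: every read is served from the existing store, the same read is issued against the new store, and any difference is recorded for later investigation. This is an illustration under stated assumptions, not Uber's code. The class name `ShadowReader` is hypothetical, and plain dicts stand in for DynamoDB and LedgerStore.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowReader:
    primary: dict   # stand-in for the existing store (source of truth)
    shadow: dict    # stand-in for the new store being validated
    matches: int = 0
    mismatches: list = field(default_factory=list)

    def get(self, key):
        """Serve the read from the primary store and compare it with the shadow."""
        expected = self.primary.get(key)
        try:
            actual = self.shadow.get(key)
        except Exception as exc:
            # Assumption: a failure in the shadow path must never reach the caller.
            self.mismatches.append((key, expected, repr(exc)))
            return expected
        if actual == expected:
            self.matches += 1
        else:
            self.mismatches.append((key, expected, actual))
        return expected

    def match_rate(self):
        """Fraction of compared reads where both stores agreed."""
        total = self.matches + len(self.mismatches)
        return self.matches / total if total else 1.0
```

Because callers always receive the primary store's answer, the comparison can run against live traffic without changing behavior. A match rate tracked over time could then serve as one signal for deciding when to move to the next phase of a rollout.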

Common Pitfalls

  1. Failing to implement effective rate control can lead to system overload during backfills.
     Without proper rate limiting, backfill jobs can generate excessive load, potentially causing service disruptions. It's crucial to monitor system performance and adjust the rate of data processing accordingly.
  2. Neglecting to validate historical data can result in undetected data integrity issues.
     If validation focuses only on recent data, older records may contain errors that go unnoticed, leading to long-term data quality problems. Comprehensive validation strategies must include all data, regardless of access frequency.
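
To make the second pitfall concrete, here is a minimal sketch of offline validation: compare complete dumps of both stores, list the records that are missing or different, and backfill only the missing ones. It uses in-memory dicts rather than a distributed engine such as Spark, and the function names `offline_diff` and `backfill_missing` are hypothetical. At real scale the same comparison would presumably run as a distributed join.

```python
def offline_diff(old_records, new_records):
    """Compare complete datasets keyed by record ID.

    Returns (missing, mismatched): keys absent from the new store, and keys
    present in both stores whose values differ.
    """
    missing = sorted(k for k in old_records if k not in new_records)
    mismatched = sorted(
        k for k, v in old_records.items()
        if k in new_records and new_records[k] != v
    )
    return missing, mismatched


def backfill_missing(old_records, new_records, missing):
    """Copy missing records into the new store without overwriting existing ones."""
    for key in missing:
        # setdefault never replaces a record that is already present,
        # which fits a store of immutable ledger entries.
        new_records.setdefault(key, old_records[key])
    return len(missing)
```

Mismatched records are reported rather than overwritten, since a differing value in an immutable ledger calls for investigation, not a silent fix.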

Related Concepts

Database Migration Strategies
Data Validation Techniques
Cost Management In Cloud Databases