Overview
This article details how an intern at Shopify successfully freed up 3 terabytes of storage in preparation for Black Friday Cyber Monday (BFCM) by implementing a background job to delete unnecessary database records. The process improved database performance and resilience during high traffic periods.
What You'll Learn
1. How to build a Rails background job for database maintenance
2. Why throttling database operations is crucial for performance
3. When to leverage existing libraries for efficient data handling
Prerequisites & Requirements
- Basic understanding of MySQL and database management
- Familiarity with Rails and background job processing (optional)
Key Questions Answered
How did the intern manage to delete 5 billion records without causing downtime?
The intern utilized a Rails background job that processed deletions in small batches, allowing for pauses and checks to prevent database overload. This approach, combined with internal throttling mechanisms, ensured that the database remained responsive during the operation.
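The batching idea described above can be sketched in plain Ruby. This is a minimal illustration, not the intern's actual job: the in-memory `table` hash, the `:stale` flag, and the `delete_in_batches` helper are all hypothetical stand-ins for a real MySQL table and its deletion criteria. The optional block marks the spot where a pause or health check could run between batches.

```ruby
# Hypothetical in-memory stand-in for a table: primary key => row.
# Deletes stale rows a few at a time instead of in one huge statement.
def delete_in_batches(table, batch_size:)
  deleted = 0
  loop do
    # Pick at most batch_size stale rows, lowest primary keys first.
    batch = table.select { |_id, row| row[:stale] }.keys.sort.first(batch_size)
    break if batch.empty?

    batch.each { |id| table.delete(id) }
    deleted += batch.size

    # Hook between batches: a pause or load check would go here.
    yield deleted if block_given?
  end
  deleted
end

table = (1..10).to_h { |id| [id, { stale: id.odd? }] }
total = delete_in_batches(table, batch_size: 2) do |so_far|
  puts "deleted #{so_far} rows so far"
end
puts "total deleted: #{total}, remaining ids: #{table.keys.inspect}"
```

In a Rails application, the same shape would typically be expressed with Active Record's `in_batches` and `delete_all` rather than an in-memory hash; the key point is that no single statement touches more than one small batch.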
What was the impact of deleting unnecessary records on database performance?
After the deletion, the database was scanning approximately 3 times fewer pages to return a single record. This significant reduction in page scans allowed the database to handle increased traffic during flash sales more efficiently.
What tools did the intern use to facilitate the background job?
The intern leveraged Shopify's job-iteration library to manage background job iterations and an internal throttling enumerator to monitor database load. These tools helped ensure that deletions did not overwhelm the database during peak operations.
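To make the role of job-iteration more concrete: the library's documented interface asks a job to define `build_enumerator(cursor:)` and `each_iteration`, so that a long-running job can be interrupted and later resumed from a saved cursor. The class below is a plain-Ruby model of that contract, assuming nothing about the gem's internals; `ResumableDeleteJob`, `run`, and `should_stop` are hypothetical names used only for illustration.

```ruby
# Plain-Ruby model of an interruptible, resumable job.
# This is NOT the job-iteration gem; it only mimics the idea of a cursor.
class ResumableDeleteJob
  attr_reader :cursor, :deleted

  def initialize(ids)
    @ids = ids.sort
    @cursor = nil    # last primary key that was fully processed
    @deleted = []    # stands in for rows removed from the database
  end

  # Yields [item, cursor] pairs, skipping everything at or before the cursor.
  def build_enumerator(cursor:)
    Enumerator.new do |yielder|
      @ids.each do |id|
        yielder << [id, id] if cursor.nil? || id > cursor
      end
    end
  end

  # The per-item work; a real job would delete the record here.
  def each_iteration(id)
    @deleted << id
  end

  # Runs until finished, or until should_stop reports true after an item.
  def run(should_stop: -> { false })
    build_enumerator(cursor: @cursor).each do |id, new_cursor|
      each_iteration(id)
      @cursor = new_cursor
      return :interrupted if should_stop.call
    end
    :completed
  end
end

job = ResumableDeleteJob.new([3, 1, 2, 5, 4])
seen = 0
p job.run(should_stop: -> { (seen += 1) == 2 }) # interrupted after two items
p job.cursor
p job.run                                       # resumes after the cursor
p job.deleted
```

Because progress lives in the cursor rather than in the process, a deploy or worker restart costs at most the item in flight, which is what makes a multi-hour deletion practical.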
What challenges did the intern face while running the background job?
The intern was nervous about potential downtime for merchants if the background job failed. However, by carefully managing the deletion process and using existing libraries, they successfully deleted records at a peak rate of six million records per minute without causing issues.
Key Statistics & Figures
Storage freed
3 terabytes
This storage was reclaimed by deleting unnecessary records before BFCM.
Records deleted
5 billion records
The intern's background job processed deletions at a peak rate of six million records per minute.
Page scans reduction
3x fewer pages
The database scanned significantly fewer pages to return a single record after the deletion.
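A quick back-of-the-envelope calculation puts these figures in perspective. The article does not state the total runtime or the average record size, so both values below are derived estimates: the runtime is only a lower bound (it assumes the peak rate was sustained throughout), and the storage figure assumes decimal terabytes.

```ruby
RECORDS_DELETED      = 5_000_000_000
PEAK_RATE_PER_MINUTE = 6_000_000
STORAGE_FREED_BYTES  = 3 * 10**12 # assumption: 1 TB = 10^12 bytes

# Lower bound: the job cannot finish faster than its peak rate allows.
min_minutes = RECORDS_DELETED.fdiv(PEAK_RATE_PER_MINUTE)
min_hours   = min_minutes / 60

# Average space reclaimed per deleted record, including any index overhead.
bytes_per_record = STORAGE_FREED_BYTES / RECORDS_DELETED

puts format("Lower bound on runtime: %.1f minutes (%.1f hours)", min_minutes, min_hours)
puts "Average storage per deleted record: #{bytes_per_record} bytes"
```

Even at peak throughput the deletion would take roughly 14 hours, which is why interruptibility and throttling matter for a job of this size.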
Technologies & Tools
Backend
Rails
Used to build the background job for deleting records.
Database
MySQL
The database system where the records were stored and managed.
Library
job-iteration
A Shopify library used to manage background job iterations efficiently.
Key Actionable Insights
1. Implementing background jobs in small batches can significantly reduce the risk of database overload during high-demand periods. This approach allows for better resource management and ensures that critical operations can continue without interruption, especially during events like BFCM.
2. Utilizing existing libraries and frameworks can streamline complex tasks and reduce development time. By leveraging Shopify's job-iteration library, the intern was able to focus on the logic of the job rather than the underlying mechanics, leading to a more efficient implementation.
3. Throttling mechanisms are essential for maintaining database health during large-scale operations. The internal throttling enumerator helped the intern monitor database load, pausing operations when necessary to prevent performance degradation.
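The throttling insight can be illustrated with a small wrapper enumerator. Shopify's internal throttling enumerator is not public in this article, so the sketch below is an assumption about the general technique rather than a description of that tool: `throttled`, `throttle_on`, `backoff`, and `sleeper` are hypothetical names, and the load check is a stand-in for a real signal such as replication lag.

```ruby
# Wraps any enumerator so each item is released only once the load check passes.
# throttle_on: callable returning true while the database is under pressure.
# sleeper:     injectable so the backoff can be observed without really sleeping.
def throttled(enum, throttle_on:, backoff: 0.01, sleeper: ->(seconds) { sleep(seconds) })
  Enumerator.new do |yielder|
    enum.each do |item|
      # Back off repeatedly until the load check clears.
      sleeper.call(backoff) while throttle_on.call
      yielder << item
    end
  end
end

# Simulated load readings: true means "database is busy, wait".
readings = [true, true, false, false, true, false]
pauses = 0

batches = throttled(
  [[1, 2], [3, 4], [5, 6]].each,
  throttle_on: -> { readings.shift },
  sleeper: ->(_seconds) { pauses += 1 }
)

batches.each { |batch| puts "deleting batch #{batch.inspect}" }
puts "paused #{pauses} times"
```

Keeping the throttle in the enumerator rather than in the deletion logic means the per-batch code stays unaware of load conditions, and the same wrapper can protect other maintenance jobs.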
Common Pitfalls
1. Performing large batch deletions without proper management can lead to database overload and downtime.
This often happens when developers do not consider the impact of their operations on database performance, especially during peak traffic times.
Related Concepts
Database Optimization Techniques
Background Job Processing
Throttling in Database Operations
MySQL Indexing Strategies