Learn how the Database Platform team performed the most recent MySQL upgrade at Shopify and how this changed our upgrade guidelines moving forward.
Overview
This article discusses the recent MySQL upgrade at Shopify, detailing the motivations behind the upgrade, the challenges faced during the process, and the internal tools developed to streamline future upgrades. The Database Platform team shares insights on rollback testing, the importance of documentation, and the evolution of their infrastructure to support scalability and reliability.
What You'll Learn
1
How to automate MySQL upgrades using internal tooling
2
Why rollback testing is critical before a database upgrade
3
When to implement proactive database management strategies
Prerequisites & Requirements
- Understanding of MySQL database management and operations
- Familiarity with Chef for configuration management(optional)
Key Questions Answered
What were the main reasons for upgrading MySQL at Shopify?
Shopify upgraded MySQL primarily for replication improvements, to handle increased load on replicas, and to ensure better performance during peak times like Black Friday/Cyber Monday. Staying current with software dependencies for security and stability was also a key factor.
How did Shopify automate the MySQL upgrade process?
Shopify utilized a tool called Mason, evolving it into a proactive system that automates the provisioning of upgraded VMs. This allowed the team to define per-shard topologies and reduce manual toil during upgrades.
What challenges did Shopify face during rollback testing?
During rollback testing, Shopify encountered issues where the MySQL server wouldn't start properly after downgrading due to changes in table structure. This led to lengthy startup times and necessitated internal patches to address the problem.
What is the significance of the 'Packed Keys' feature in MySQL?
The 'Packed Keys' feature in MySQL's MyISAM storage engine compresses keys to improve index performance. However, it is not supported by the InnoDB engine, leading to unexpected behavior during upgrades that necessitated careful handling of ALTER TABLE commands.
Key Statistics & Figures
Previous MySQL version
5.7.21
The last version running before the upgrade to 5.7.32.
Current MySQL version
5.7.32
The version to which Shopify upgraded its MySQL databases.
Number of VMs in fleet
thousands
The scale of the virtual machine fleet that supports Shopify's database infrastructure.
Time taken for rollback testing
over a day
The time it took for the MySQL server to calculate transient stats during rollback testing.
Technologies & Tools
Database
Percona Server
Used as the MySQL server variant for Shopify's database infrastructure.
Cloud Infrastructure
Google Compute Engine
Hosts Shopify's horizontally sharded database infrastructure.
Configuration Management
Chef
Used for managing the configuration and provisioning of MySQL servers.
Key Actionable Insights
1Implement comprehensive rollback testing before any database upgrade to identify potential issues early.Extensive rollback testing at Shopify revealed critical bugs that could have jeopardized the upgrade process. This practice ensures that teams can confidently proceed with upgrades knowing they have a fallback plan.
2Utilize automation tools like Mason to reduce manual intervention during database management tasks.By automating VM provisioning and upgrades, Shopify minimized operational toil and improved efficiency, allowing engineers to focus on more strategic initiatives rather than routine maintenance.
3Document upgrade processes and create checklists to streamline future database upgrades.Creating clear documentation and guidelines for upgrades has made it easier for the Database Platform team to execute upgrades quickly and safely, reducing reliance on institutional knowledge.
Common Pitfalls
1
Failing to conduct thorough rollback testing can lead to critical issues during upgrades.
Without adequate rollback testing, teams may encounter unexpected failures that can disrupt service availability and require extensive recovery efforts.
2
Neglecting to document upgrade processes can create confusion and inefficiencies in future upgrades.
Lack of documentation can lead to reliance on individual knowledge, making it difficult for teams to execute upgrades consistently and safely.
Related Concepts
Database Upgrade Strategies
Replication Performance In Mysql
Automation In Database Management