How to Introduce Composite Primary Keys in Rails

One line of code can 5x your Rails application performance. What if you could make a small change to your database design that would unlock massively more efficient data access? At Shopify, we dusted off some old database principles and did exactly that with the primary Rails application.

John Arthorne
10 min read · advanced

Overview

The article discusses how Shopify improved database efficiency by introducing composite primary keys in their Rails application. It details the challenges faced during implementation and the significant performance gains achieved, particularly in query speed and database capacity.

What You'll Learn

  1. How to implement composite primary keys in a Rails application
  2. Why data clustering improves database performance
  3. When to consider using composite primary keys over simple primary keys

Prerequisites & Requirements

  • Understanding of Rails Active Record and database design principles
  • Familiarity with MySQL and its InnoDB storage engine (optional)

Key Questions Answered

How do composite primary keys improve database access efficiency?
Composite primary keys group records by related identifiers, reducing the number of disk pages loaded during queries. This clustering aligns with access patterns, leading to faster query responses and lower buffer pool reads, as demonstrated by a 5-6x improvement in query time for critical queries.
What challenges arise when implementing composite primary keys in Rails?
The main challenge is that Active Record assumes a simple integer primary key. This can be mitigated by treating an auto-incrementing secondary key as the primary key, allowing the application to remain unaware of the underlying database schema changes.
What performance trade-offs exist when switching to composite primary keys?
While composite primary keys significantly improve query performance, they can degrade insert performance by approximately 10x due to the need to read and flush more distinct database pages. This trade-off is acceptable in scenarios where reads are more frequent than writes.
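The schema change described above can be sketched as a single MySQL `ALTER TABLE` statement. The helper below is hypothetical and only builds the SQL string. The table name `orders` comes from the article, while the tenant column `shop_id` and the index name are assumptions made for illustration. The extra unique key on `id` is there because InnoDB requires an `AUTO_INCREMENT` column to be the first column of some index, and because it keeps `id` unique so that Active Record can go on treating it as the primary key.

```ruby
# Hypothetical helper (not from the article): builds the MySQL DDL that moves
# a table's clustering key to (tenant_column, id) while keeping `id` unique.
def composite_primary_key_sql(table, tenant_column, id_column: "id")
  index_name = "index_#{table}_on_#{id_column}" # assumed naming convention
  [
    "ALTER TABLE `#{table}`",
    "  DROP PRIMARY KEY,",
    "  ADD PRIMARY KEY (`#{tenant_column}`, `#{id_column}`),",
    # InnoDB needs the AUTO_INCREMENT column to lead at least one index.
    "  ADD UNIQUE KEY `#{index_name}` (`#{id_column}`)"
  ].join("\n")
end

# `shop_id` is an assumed tenant column name.
puts composite_primary_key_sql("orders", "shop_id")
```

On a large production table a statement like this would normally be run through a live-migration tool rather than executed directly. On the Rails side, a model may need to declare `self.primary_key = :id` explicitly so that the application stays unaware of the composite key; whether that is required depends on the Rails version, so treat it as an assumption to verify.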

Key Statistics & Figures

Median buffer pool reads per query
Reduced from 1.8 to 1.2
This reduction was observed after implementing composite primary keys on the orders table.
Reduction in slow queries
Approximately 80%
This was noted in the log of slow queries after the changes were made.
Improvement in query time
5-6x
This improvement was measured for the most common queries consuming database capacity.
Performance degradation on inserts
10x
This was observed after switching to composite primary keys.
Total elapsed database time reduction
Roughly one hour per day, per shard
This was the net benefit observed across all queries involving the changed table.
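To build intuition for why reads get cheaper and inserts get more expensive, here is a toy simulation of how rows map to pages under each key order. It is a rough sketch, not a model of InnoDB: the page capacity, shop count, and order count are made-up parameters, and `shop_id` is an assumed tenant column name.

```ruby
ROWS_PER_PAGE = 100 # assumed page capacity, not a real InnoDB figure
SHOPS = 50          # assumed number of tenants
ORDERS = 10_000     # assumed table size

# Orders arrive interleaved across shops; ids follow arrival order.
orders = (1..ORDERS).map { |id| { id: id, shop_id: id % SHOPS } }

# Maps each row id to the page it lands on when rows are stored
# sorted by the given key (a stand-in for the clustered index order).
def page_of_each_row(rows, rows_per_page, &key)
  rows.sort_by(&key)
      .each_with_index
      .map { |row, i| [row[:id], i / rows_per_page] }
      .to_h
end

# Counts the distinct pages that hold the given row ids.
def pages_touched(layout, ids)
  ids.map { |id| layout.fetch(id) }.uniq.size
end

by_id   = page_of_each_row(orders, ROWS_PER_PAGE) { |r| r[:id] }
by_shop = page_of_each_row(orders, ROWS_PER_PAGE) { |r| [r[:shop_id], r[:id]] }

# Read pattern: fetch every order belonging to one shop.
shop_ids = orders.select { |r| r[:shop_id] == 7 }.map { |r| r[:id] }
# Write pattern: the 100 most recently inserted orders.
recent_ids = orders.last(100).map { |r| r[:id] }

puts "read,  key (id):          #{pages_touched(by_id, shop_ids)} pages"
puts "read,  key (shop_id, id): #{pages_touched(by_shop, shop_ids)} pages"
puts "write, key (id):          #{pages_touched(by_id, recent_ids)} pages"
puts "write, key (shop_id, id): #{pages_touched(by_shop, recent_ids)} pages"
```

With these made-up parameters, the per-shop read drops from 100 pages to 2, while the batch of recent inserts spreads from 1 page to 50. The ratios are artifacts of the toy setup and should not be compared with the measured figures above; only the direction of the trade-off carries over.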

Technologies & Tools


Database
MySQL
Used for storing and managing data in the Rails application.
Database
InnoDB
The storage engine used in MySQL for handling the data structure.
Backend
Active Record
The Rails component that abstracts database interactions.
Tools
Large Hadron Migrator (lhm)
Used for live schema migrations in the database.
Tools
Ghostferry
Facilitates live data migration across MySQL instances.

Key Actionable Insights

1. Consider implementing composite primary keys for tables with distinct access patterns, such as those in multi-tenant applications.
This approach can lead to significant improvements in query performance, as evidenced by Shopify's experience where they achieved a 5-6x reduction in query time.
2. Evaluate your application's data access patterns before making schema changes.
Understanding how data is accessed can help determine whether a composite primary key will provide the desired performance benefits, as not all tables will benefit equally.
3. Be cautious of the potential for increased complexity in your data migration processes when introducing composite keys.
The introduction of composite primary keys may complicate existing migration tools and processes, necessitating updates to ensure compatibility.

Common Pitfalls

1. Assuming that all tables will benefit from composite primary keys without analyzing access patterns.
This can lead to unnecessary complexity and potential performance degradation if the data access patterns do not align with the new key structure.

Related Concepts

Database Design Principles
Data Clustering
Rails Active Record
MySQL Performance Optimization