Pummelling the Platform–Performance Testing Shopify

Insight into the tools Shopify uses to raise confidence in our ability to serve large sales events, as well as the experimentation and regression framework built to ensure that we’re getting better, week-over-week, at handling load.

Chris Inch
14 min read · Advanced

Overview

The article discusses Shopify's approach to performance testing, emphasizing the importance of ensuring stability and speed during high-traffic sales events like Black Friday Cyber Monday (BFCM). It outlines the tools and methodologies used for load and stress testing, as well as the internal processes that support continuous improvement in performance testing.

What You'll Learn

1. How to conduct load testing to verify service performance under heavy traffic
2. Why stress testing is crucial for understanding service limits
3. How to use HAR files for realistic load testing
4. When to implement performance testing in the development cycle

Prerequisites & Requirements

  • Understanding of performance testing concepts
  • Familiarity with load testing tools, or with browser automation tools like Puppeteer (optional)

Key Questions Answered

What types of performance testing does Shopify conduct?
Shopify conducts load testing and stress testing as part of its performance testing strategy. Load testing verifies that services can handle a specific number of requests, while stress testing determines the upper limits of service capacity by applying excessive load until the service fails.
How does Shopify generate realistic load for testing?
Shopify generates realistic load using a homegrown tool that sends requests to specific endpoints. This tool employs Lua scripts to simulate user behavior and can issue tens of millions of requests per minute, allowing for effective stress and load testing.
What is the purpose of the Cronograma tool at Shopify?
Cronograma is an internal tool that facilitates the setup and tracking of performance testing experiments. It allows developers to define hypotheses, run tests, and observe results, ensuring that performance testing follows a scientific method for repeatability and clarity.
Why did Shopify pivot from browser-based load testing to HAR-based testing?
Shopify pivoted to HAR-based testing due to the high computing costs associated with browser-based solutions. HAR files allow for the simulation of realistic browsing sessions without the overhead of running full browsers, enabling scalability to tens of millions of sessions.
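
To make the HAR-based approach more concrete, here is a minimal sketch of how a recorded browsing session could be turned into a replay plan. It is not Shopify's tooling (the article describes a Go load generator that runs Lua scripts); it only reads the standard HAR layout (`log.entries[].request`, `startedDateTime`). The function name `load_replay_plan`, the sample URLs, and the timestamps are hypothetical.

```python
import json
from datetime import datetime


def load_replay_plan(har_text):
    """Turn a HAR document into an ordered list of (offset_seconds, method, url).

    Offsets are relative to the first recorded request, so a load generator
    could replay the session with the same pacing without running a browser.
    """
    entries = json.loads(har_text)["log"]["entries"]
    parsed = []
    for entry in entries:
        # HAR timestamps are ISO 8601; normalise a trailing "Z" for older Pythons.
        started = datetime.fromisoformat(
            entry["startedDateTime"].replace("Z", "+00:00")
        )
        request = entry["request"]
        parsed.append((started, request["method"], request["url"]))

    # Entries are not guaranteed to be in chronological order.
    parsed.sort(key=lambda item: item[0])
    if not parsed:
        return []

    first = parsed[0][0]
    return [
        ((started - first).total_seconds(), method, url)
        for started, method, url in parsed
    ]


# Hypothetical two-request session, deliberately stored out of order.
SAMPLE_HAR = json.dumps({
    "log": {
        "version": "1.2",
        "entries": [
            {
                "startedDateTime": "2020-11-27T09:00:01.500Z",
                "request": {"method": "POST", "url": "https://shop.example/cart/add"},
            },
            {
                "startedDateTime": "2020-11-27T09:00:00.000Z",
                "request": {"method": "GET", "url": "https://shop.example/products/widget"},
            },
        ],
    }
})

plan = load_replay_plan(SAMPLE_HAR)
for offset, method, url in plan:
    print(f"+{offset:.1f}s {method} {url}")
```

Because the plan is plain data, many simulated sessions can share one recording, which is where the cost advantage over full browsers comes from.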

Key Statistics & Figures

Number of merchants relying on Shopify: over 1 million
This statistic highlights the scale at which Shopify operates, particularly during major sales events.

Sales during BFCM 2020: $5.1 billion
This figure represents the record-breaking sales achieved by Shopify merchants during the Black Friday Cyber Monday weekend.

Requests per minute in an example load test: 1 million
This is an example target rate for a load test used to validate service capacity.

Technologies & Tools


Backend: Go
Used for the load generator that executes Lua scripts for performance testing.

Scripting: Lua
Used to write scripts that simulate user behavior during load tests.

Data Format: HAR files
Utilized for realistic load testing by capturing network requests made during browsing sessions.

Testing Tool: Puppeteer
Initially explored for browser-based load testing before pivoting to HAR-based solutions.

Key Actionable Insights

1. Implementing a robust load testing strategy is essential for ensuring service reliability during peak traffic events.
By proactively identifying bottlenecks through load testing, teams can prevent performance issues that could impact sales during critical events like BFCM.

2. Utilizing HAR files can significantly enhance the realism of load tests without incurring high costs.
This method allows developers to simulate actual user behavior more accurately, which is crucial for understanding how services will perform under real-world conditions.

3. Encouraging a culture of performance testing across all teams can lead to overall platform improvements.
When performance testing becomes a common practice, it fosters a proactive approach to identifying and resolving potential issues before they affect customers.

Common Pitfalls

1. Assuming that all requests are equal in load testing can lead to misleading results.
Different types of requests can have varying impacts on system performance, so it's essential to test a range of realistic endpoints.

2. Neglecting to simulate the actual traffic shape during load tests can result in inaccurate assessments.
Flash sales often have spiky traffic patterns, and failing to replicate this can lead to underestimating system stress.
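
Both pitfalls can be illustrated with a small sketch: one function shapes the request rate like a flash sale (a flat baseline, a sudden spike, then a decay), and another splits each minute's rate across endpoints of different cost. The exponential decay model, the function names, the endpoint names, and the weights are all assumptions for illustration, not figures from the article.

```python
def flash_sale_profile(baseline_rpm, peak_rpm, spike_minute, total_minutes, decay=0.5):
    """Return a requests-per-minute target for each minute of the test.

    Traffic sits at baseline_rpm, jumps to peak_rpm at spike_minute, and the
    excess over baseline then shrinks by the factor `decay` every minute.
    """
    profile = []
    for minute in range(total_minutes):
        if minute < spike_minute:
            rpm = baseline_rpm
        else:
            elapsed = minute - spike_minute
            rpm = baseline_rpm + (peak_rpm - baseline_rpm) * (decay ** elapsed)
        profile.append(round(rpm))
    return profile


def split_by_endpoint(rpm, weights):
    """Divide a per-minute rate across endpoints in proportion to their weights."""
    total = sum(weights.values())
    return {
        endpoint: round(rpm * weight / total)
        for endpoint, weight in weights.items()
    }


# Hypothetical traffic mix: browsing dominates, checkout is rare but expensive.
ENDPOINT_WEIGHTS = {"storefront": 6, "cart": 3, "checkout": 1}

profile = flash_sale_profile(baseline_rpm=1000, peak_rpm=9000,
                             spike_minute=2, total_minutes=5)
peak_mix = split_by_endpoint(max(profile), ENDPOINT_WEIGHTS)

for minute, rpm in enumerate(profile):
    print(minute, rpm, split_by_endpoint(rpm, ENDPOINT_WEIGHTS))
```

A constant-rate test at the average of this profile would never exercise the jump at the spike minute, which is the moment a flash sale is most likely to expose a bottleneck.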

Related Concepts

Performance Testing
Load Testing
Stress Testing
Scientific Method in Testing