Docker at Shopify: How We Built Containers that Power Over 100,000 Online Shops

This is the second in a series of blog posts describing our evolution of Shopify toward a Docker-powered, containerized data center. This instalment will focus on the creation of the container used in our production environment when you visit a Shopify storefront. Read the first post in this series here.

Why containerize?

Before we dive into the mechanics of building containers, let's discuss motivation. Containers have the potential to do for the data center what consoles did for gaming. In the early days of PC gaming, each game typically required video or sound driver massaging before you got to play. Gaming consoles, however, offered a different experience:

  • predictable: cartridges were self-contained fun: always ready-to-run, with no downloads or updates.
  • fast: cartridges used read-only memory for lightning fast speeds.
  • easy: cartridges were robust and largely child-proof; they were quite literally plug-and-play.

Predictable, fast, and easy are all good things at scale. Docker containers provide the building blocks to make our data centers easier to run and more adaptable by placing applications into self-contained, ready-to-run units much like cartridges did for console games.

Graeme Johnson

Overview

This article discusses Shopify's transition to a Docker-powered, containerized data center, detailing the creation of containers that support over 100,000 online shops. It covers the motivations for containerization, the bootstrapping process, environment setup, and best practices for managing containers effectively.

What You'll Learn

1. How to effectively bootstrap a Docker container for production environments
2. Why using 'thin' containers can optimize resource usage
3. When to apply the '100 rule' for service management in containers
4. How to structure applications within Docker containers for scalability

Prerequisites & Requirements

  • Understanding of Docker and containerization concepts
  • Familiarity with Chef for managing production nodes (optional)

Key Questions Answered

What are the benefits of using Docker containers in a data center?
Docker containers provide predictability, speed, and ease of use, similar to gaming consoles, by encapsulating applications into self-contained units. This approach simplifies management and scaling of applications across a large data center environment.
How does Shopify manage logging in a containerized environment?
Shopify avoids running multiple log relay agents by implementing a single daemon on the host that subscribes to Docker events. This daemon configures log indexing for each container on start and cleans up on destroy, ensuring efficient log management without redundancy.
What is the '100 rule' in container service management?
The '100 rule' asks whether a service really needs to run inside every container. Where it does not, a single instance of the service is shared across multiple containers, which optimizes resource usage and reduces overhead.
How does Shopify structure its Docker containers for applications?
Shopify prefers 'thin' containers that focus on a single responsibility, allowing for fine-grained scaling. Applications are organized with a specific directory structure, ensuring that they are easy to manage and deploy within the container environment.
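
The single-daemon logging approach from the logging answer above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Shopify's implementation: the function names (`log_path_for`, `apply_event`, `run`) are hypothetical, the events are assumed to arrive as JSON lines in the shape produced by `docker events --format '{{json .}}'`, and the log path assumes Docker's default json-file logging driver and data root.

```python
import json
from typing import Dict, Iterable


def log_path_for(container_id: str) -> str:
    # Assumption: default json-file logging driver and default Docker data root.
    return f"/var/lib/docker/containers/{container_id}/{container_id}-json.log"


def apply_event(watched: Dict[str, str], event: dict) -> None:
    """Update the set of indexed logs in response to one Docker event."""
    if event.get("Type") != "container":
        return  # ignore image, network, and volume events
    container_id = event.get("Actor", {}).get("ID")
    if not container_id:
        return
    action = event.get("Action")
    if action == "start":
        # Begin indexing this container's log.
        watched[container_id] = log_path_for(container_id)
    elif action == "destroy":
        # Clean up; tolerate containers we never saw start.
        watched.pop(container_id, None)


def run(lines: Iterable[str]) -> Dict[str, str]:
    """Consume JSON event lines and return the logs still being indexed."""
    watched: Dict[str, str] = {}
    for line in lines:
        if line.strip():
            apply_event(watched, json.loads(line))
    return watched
```

Passing `sys.stdin` to `run` while piping in the output of `docker events --format '{{json .}}'` would exercise the sketch against a live host. A real daemon would hand each start and destroy event to the log indexer as it arrives rather than only tracking paths in a dictionary.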

Key Statistics & Figures

Number of online shops powered by Shopify containers
Over 100,000
This statistic highlights the scale at which Shopify operates its containerized environment.
Lines in the base Dockerfile
125
The base Dockerfile defines the shared environment for all containers at Shopify, ensuring consistency and efficiency.
Number of packages in the base image
25
These packages include essential language runtimes and development tools necessary for application functionality.

Technologies & Tools

Containerization
Docker
Used for creating and managing containers that run Shopify applications.
Configuration Management
Chef
Utilized to manage production nodes and automate the setup of the environment.
Message Broker
Kafka
Used as an event bus for publishing real-time events within the Shopify stack.
Monitoring
StatsD
Employed for collecting runtime statistics from various levels of the application.

Key Actionable Insights

1. Adopt 'thin' containers to optimize resource consumption and performance.
   By minimizing the components within each container, you can reduce CPU and memory usage, leading to faster application performance and easier management.
2. Implement the '100 rule' to streamline service deployment in containers.
   This approach encourages sharing services across containers, which can significantly reduce redundancy and improve resource efficiency in a containerized environment.
3. Utilize a single logging daemon to manage logs efficiently across containers.
   This setup prevents the overhead of running multiple logging instances, ensuring that log management is centralized and effective, which is crucial for diagnosing production issues.
4. Structure your application code and Dockerfiles with clear conventions.
   Establishing a standard layout for your applications within containers can simplify development and deployment processes, making it easier for teams to collaborate and maintain code.
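
As one illustration of the '100 rule', containers can send runtime statistics over UDP to a single metrics agent on the host instead of each running an agent of their own. The sketch below assumes the metrics agent is one of the services shared at host level, which this summary does not state explicitly. The function names are hypothetical; the only fixed element is the standard StatsD line format `<name>:<value>|<type>`.

```python
import socket
from typing import Tuple


def format_metric(name: str, value: int, kind: str = "c") -> bytes:
    """Encode one metric in the StatsD line format: <name>:<value>|<type>.

    Common types are "c" (counter), "g" (gauge), and "ms" (timer).
    """
    return f"{name}:{value}|{kind}".encode("ascii")


def send_metric(
    sock: socket.socket,
    agent_addr: Tuple[str, int],
    name: str,
    value: int,
    kind: str = "c",
) -> None:
    """Fire-and-forget one metric to the shared host-level agent."""
    # UDP keeps the container thin: no local agent, no connection state.
    sock.sendto(format_metric(name, value, kind), agent_addr)
```

Inside a container this might be called as `send_metric(socket.socket(socket.AF_INET, socket.SOCK_DGRAM), ("172.17.0.1", 8125), "storefront.requests", 1)`. The address is an assumption: 172.17.0.1 is the usual default Docker bridge gateway and 8125 the conventional StatsD port, and the metric name is made up for the example.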

Common Pitfalls

1. Failing to properly configure logging can lead to lost data in containerized environments.
   Since containers can exit and their filesystems disappear, it's crucial to implement a centralized logging solution to ensure that logs are preserved and accessible for troubleshooting.
2. Overcomplicating container setups with unnecessary services.
   Keeping containers lean and focused on single responsibilities prevents resource bloat and enhances performance, making it easier to scale applications effectively.

Related Concepts

Container Orchestration
Microservices Architecture
Continuous Integration and Deployment