The Design & Implementation of Sprites

Thomas Ptacek

Replacement-level homeowners buy boxes of pens and stick them in “the pen drawer”. What the elites know: you have to think adversarially about pens. “The purpose of a system is what it does”; a household’s is to uniformly distribute pens. Months fro

Fly.io


Claude, Docker, Elixir, Gemini, HTTPS, SQLite

Overview

This article explains the design and implementation of Sprites, Fly.io's new product: Linux VMs that are created in seconds and come with 100GB of durable storage backed by object storage. The post details three architectural decisions that differentiate Sprites from Fly Machines: eliminating container images so creation is near-instant, using S3-compatible object storage instead of attached NVMe as the root of durable disk, and moving orchestration logic inside the VM itself.

What You'll Learn

1. Why eliminating container images enables instant VM creation in cloud platforms
2. How to use S3-compatible object storage as the root of durable VM disk storage instead of attached NVMe
3. How inside-out orchestration (running management services inside the VM) simplifies platform operations and reduces blast radius
4. Why splitting storage into data chunks on object storage and metadata in local SQLite enables fast checkpoint and restore
5. When to choose disposable VMs over traditional container-based deployments for development workflows

Prerequisites & Requirements

Understanding of containers, Docker, and OCI images
Familiarity with cloud infrastructure concepts (VMs, NVMe storage, object storage like S3)
Basic understanding of Linux namespaces and containerization (optional)
Experience with cloud deployment platforms (e.g., Fly.io, AWS EC2) (optional)

Key Questions Answered

What are Fly.io Sprites and how do they differ from Fly Machines?

Sprites are Linux VMs that are created in 1-2 seconds, come with 100GB of durable storage, auto-sleep when inactive, and have no time limits. Creating a Fly Machine means pulling and unpacking a container image, which can take over a minute; Sprites eliminate user-facing containers entirely. Every worker pre-pools empty Sprites built from a single standard container, so creating a Sprite is as fast as starting an existing Fly Machine.

Why does removing container images make VM creation instant?

Container images are the primary bottleneck in Fly Machine creation — they're large, take time to pull and unpack, and have poor regional locality (creating on one server in a region doesn't speed up creation on another nearby server). Sprites eliminate this by using a single standard container for all instances, allowing physical workers to pre-pool empty Sprites that are ready immediately.
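The pre-pooling idea can be sketched in a few lines. This is a minimal illustration, not Fly.io's implementation: `Sprite`, `WarmPool`, `target_size`, and `create` are hypothetical names, and "booting" is reduced to constructing an object. The point it shows is that the slow work happens in `refill`, ahead of demand, so `create` only hands out an instance that already exists.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_ids = count(1)


@dataclass
class Sprite:
    """Hypothetical stand-in for a booted VM that nobody owns yet."""
    sprite_id: int = field(default_factory=lambda: next(_ids))
    owner: Optional[str] = None


class WarmPool:
    """Keeps `target_size` empty instances ready on one worker."""

    def __init__(self, target_size: int) -> None:
        self.target_size = target_size
        self.idle = deque()
        self.refill()

    def refill(self) -> None:
        # The expensive step (booting from the one standard image)
        # runs here, before any user asks for an instance.
        while len(self.idle) < self.target_size:
            self.idle.append(Sprite())

    def create(self, owner: str) -> Sprite:
        # "Creation" is just assigning an already-booted instance.
        if not self.idle:
            self.refill()
        sprite = self.idle.popleft()
        sprite.owner = owner
        self.refill()  # top the pool back up for the next caller
        return sprite


pool = WarmPool(target_size=2)
first = pool.create("alice")
second = pool.create("bob")
print(first.owner, second.owner, len(pool.idle))
```

This only works because every instance starts from the same image; with per-user container images there would be nothing generic to boot in advance.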

How does Fly.io use object storage for Sprite disk persistence?

Sprites use a JuiceFS-based storage stack that splits storage into immutable data chunks on S3-compatible object storage and metadata in a local SQLite database kept durable with Litestream. NVMe storage serves only as a sparse read-through cache to reduce read amplification. The durable state of a Sprite is essentially a URL, making migration and recovery from failed physical servers trivial.
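A toy model of that split may help. The sketch below is an assumption-laden illustration, not the Sprites storage stack: a `dict` stands in for the S3-compatible bucket, another `dict` for the NVMe cache, an in-memory SQLite database for the metadata, and chunks are keyed by content hash purely to make immutability obvious (the source does not say how chunks are named). `ChunkedDisk` and `CHUNK_SIZE` are hypothetical.

```python
import hashlib
import sqlite3

CHUNK_SIZE = 4  # unrealistically small, so the example splits visibly


class ChunkedDisk:
    def __init__(self, object_store: dict, cache: dict) -> None:
        self.object_store = object_store  # stand-in for object storage
        self.cache = cache                # stand-in for the NVMe cache
        self.meta = sqlite3.connect(":memory:")  # stand-in for local SQLite
        self.meta.execute(
            "CREATE TABLE chunks ("
            "path TEXT, idx INTEGER, key TEXT, PRIMARY KEY (path, idx))"
        )

    def write(self, path: str, data: bytes) -> None:
        # Metadata is rewritten; chunk data is only ever added.
        self.meta.execute("DELETE FROM chunks WHERE path = ?", (path,))
        for idx, off in enumerate(range(0, len(data), CHUNK_SIZE)):
            chunk = data[off:off + CHUNK_SIZE]
            key = hashlib.sha256(chunk).hexdigest()
            self.object_store.setdefault(key, chunk)  # never overwritten
            self.meta.execute(
                "INSERT INTO chunks VALUES (?, ?, ?)", (path, idx, key)
            )
        self.meta.commit()

    def read(self, path: str) -> bytes:
        rows = self.meta.execute(
            "SELECT key FROM chunks WHERE path = ? ORDER BY idx", (path,)
        ).fetchall()
        out = b""
        for (key,) in rows:
            if key not in self.cache:
                # Cache miss: fetch from object storage, keep a local copy.
                self.cache[key] = self.object_store[key]
            out += self.cache[key]
        return out


store, cache = {}, {}
disk = ChunkedDisk(store, cache)
disk.write("/etc/motd", b"hello sprite")
cache.clear()  # losing the cache loses nothing durable
print(disk.read("/etc/motd"))
```

Clearing `cache` before the read mirrors the claim that the NVMe layer holds nothing needed for correctness: every chunk can be fetched again from the object store.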

What is inside-out orchestration in Sprites?

Inside-out orchestration means the most important management and orchestration code runs inside the VM's root namespace rather than on the host. User code runs in an inner container, while a fleet of services in the root namespace handles storage, checkpoint/restore, service management, logs, and port binding. This reduces blast radius since changes don't restart host components or affect global state.

How do Sprite checkpoints work and why are they fast?

Sprite checkpoints are fast because they only shuffle metadata around rather than copying data. Since data chunks are immutable and stored on object storage, creating a checkpoint or restoring from one merely requires updating the metadata mapping (stored in SQLite with Litestream). This makes checkpoints usable as a routine feature — like git restore rather than a system recovery tool.
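To see why a metadata-only checkpoint is cheap, consider the following sketch. It assumes a drastically simplified disk where each write produces one new immutable chunk; `CheckpointableDisk` and its methods are hypothetical and do not reflect how Sprites or JuiceFS lay out metadata. What matters is that `checkpoint` and `restore` copy only chunk ids, never chunk contents.

```python
class CheckpointableDisk:
    def __init__(self) -> None:
        self.chunks = {}       # chunk id -> bytes; append-only, like object storage
        self.live = {}         # path -> list of chunk ids (the live metadata)
        self.checkpoints = {}  # checkpoint name -> frozen copy of the metadata
        self._next_id = 0

    def write(self, path: str, data: bytes) -> None:
        chunk_id = self._next_id
        self._next_id += 1
        self.chunks[chunk_id] = data  # a new chunk; old chunks are untouched
        self.live[path] = [chunk_id]

    def read(self, path: str) -> bytes:
        return b"".join(self.chunks[c] for c in self.live[path])

    def checkpoint(self, name: str) -> None:
        # Cost is proportional to the metadata, not to the data on disk.
        self.checkpoints[name] = {p: list(ids) for p, ids in self.live.items()}

    def restore(self, name: str) -> None:
        # Point the live metadata back at the chunks the checkpoint named.
        saved = self.checkpoints[name]
        self.live = {p: list(ids) for p, ids in saved.items()}


disk = CheckpointableDisk()
disk.write("app.py", b"v1")
disk.checkpoint("before-refactor")
disk.write("app.py", b"v2")
disk.restore("before-refactor")
print(disk.read("app.py"))
```

Because old chunks are never modified, the checkpoint stays valid no matter what is written afterwards; restoring is just swapping which mapping is live.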

Why is attached NVMe storage problematic for cloud VM orchestration?

Attached NVMe storage anchors workloads to specific physical servers, making migration extremely difficult. If a physical server fails, data stored only on its NVMe can be lost. It took Fly.io 3 years to implement workload migration with attached storage, and it's still not easy. Object storage eliminates this anchor, making Sprites trivially migratable since their state is just a URL on durable object storage.

How do Sprites handle networking and service discovery?

Sprites plug into Corrosion, Fly.io's gossip-based service discovery system. When you request a public URL for a Sprite, a Corrosion update propagates across the fleet instantly, making the application available with an HTTPS URL from Fly.io's proxy edges. Port binding (e.g., binding to *:8080) is handled by services in the VM's root namespace.
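For intuition on why gossip spreads an update across a fleet quickly, here is a generic push-gossip simulation. It is not Corrosion's protocol; `gossip_rounds`, `fanout`, and the node counts are hypothetical, and the random seed is fixed only to make runs repeatable. Each informed node tells a few random peers per round, so the informed set grows roughly geometrically.

```python
import random


def gossip_rounds(node_count: int, fanout: int = 3, seed: int = 0) -> int:
    """Return how many rounds one update needs to reach every node."""
    rng = random.Random(seed)
    informed = {0}  # node 0 is the first to learn about the new URL
    rounds = 0
    while len(informed) < node_count:
        rounds += 1
        # Snapshot the set: nodes informed this round start pushing next round.
        for _node in list(informed):
            peers = rng.sample(range(node_count), k=min(fanout, node_count))
            informed.update(peers)
    return rounds


for size in (10, 100, 1000):
    print(size, "nodes ->", gossip_rounds(size), "rounds")
```

The number of rounds grows far more slowly than the number of nodes, which is the property that makes a fleet-wide update feel immediate.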

When should you use Sprites vs Fly Machines for application deployment?

Sprites are optimized for interactive, disposable computing — prototyping, acceptance testing, and AI agent workloads — where instant creation and auto-sleep keep costs low. Fly Machines are better for production e-commerce and professional apps that ship via CI/CD as OCI containers and need workloads kept warm with millisecond responsiveness. The suggested workflow is to prototype on Sprites, then containerize and ship as a Fly Machine to scale.

Key Statistics & Figures

Sprite creation time

1-2 seconds

Time to create and shell into a new Sprite

Sprite root filesystem size

100GB

Durable storage included with every Sprite

Fly Machine creation time

Over 1 minute

Time to create a Fly Machine due to container pull and unpack

Time to implement workload migration with attached storage

3 years

How long it took Fly.io to get workload migration working with Fly Volumes

Technologies & Tools


Virtualization

KVM

Micro-VM hypervisor underlying both Fly Machines and Sprites

Storage

JuiceFS

Storage model for splitting data into chunks on object storage with separate metadata; Sprites use a hacked-up JuiceFS with rewritten SQLite metadata backend

Database

SQLite

Metadata backend for JuiceFS storage stack and per-account databases in the orchestrator

Replication

Litestream

Makes SQLite metadata durable by replicating to object storage

Storage

S3

S3-compatible object storage serves as the root of durable Sprite storage

Backend

Elixir

Language used for the global Sprite orchestrator

Backend

Phoenix

Web framework for the global Sprite orchestrator application

Infrastructure

Corrosion

Fly.io's gossip-based service discovery system used for instant URL propagation

Storage

dm-cache

Linux kernel feature; Sprites implement a dm-cache-like NVMe caching layer for object storage chunks

Containerization

Docker

Referenced as the OCI container format that Fly Machines use and Sprites deliberately avoid

Storage

NVMe

Local attached storage used as read-through cache for object storage chunks

Key Actionable Insights

1. Eliminate container images from ephemeral/disposable compute workflows to achieve instant creation times. When all instances run from a standard base image, physical workers can pre-pool empty instances, making creation as fast as starting an already-created VM rather than pulling and unpacking container layers.
   This is especially valuable for development environments, AI coding agents, and interactive workflows where creation latency directly impacts developer experience.

2. Use S3-compatible object storage as the root of durable VM storage rather than attached NVMe to decouple workloads from physical servers. This makes migration trivial (the durable state is just a URL), eliminates data loss risk from hardware failure, and enables fast checkpoint/restore by only shuffling metadata.
   The JuiceFS model of splitting storage into immutable data chunks on object storage and metadata in local SQLite (backed by Litestream) provides a practical architecture for this approach.

3. Move orchestration and management services inside the VM to reduce the blast radius of platform changes. When storage, service management, logging, and networking services run in the VM's root namespace (with user code in an inner container), a change only affects new VMs that pick up the update and does not require restarting host-level components.
   This 'inside-out' architecture also enables bouncing user environments without rebooting the entire VM, since the inner container can be restarted independently from the root namespace services.

4. Use NVMe as a read-through cache layer rather than as primary storage to get the performance benefits of local disk without the operational burden of data durability. Cached chunks are immutable and their canonical state lives on object storage, so nothing on the NVMe volume matters for correctness.
   This architecture dramatically simplifies operations since local storage failures are non-events: the cache simply rebuilds from object storage on the next read.

5. Design checkpoint/restore as a first-class feature rather than a disaster recovery escape hatch by making it a metadata-only operation. When data chunks are immutable on object storage, checkpoints become as lightweight as saving a metadata snapshot, enabling routine use similar to git commits.
   This shifts the mental model from 'system restore' to 'git restore', encouraging users to checkpoint frequently as part of normal workflow rather than only in emergencies.

Common Pitfalls

1. Relying on attached NVMe storage as the primary durable storage for cloud workloads. When physical servers fail, data on their local NVMe drives can be lost permanently, leaving you dependent on the last snapshot backup. This creates a sharp edge for any workload that lacks explicit replication of the kind multi-node Postgres provides.

   Object storage provides much stronger durability guarantees. Use local NVMe only as a performance cache layer where data loss is inconsequential because the canonical state lives elsewhere.

2. Anchoring workloads to specific physical servers through attached storage, which prevents easy migration and server draining. Fly.io found that adding attached storage (Fly Volumes) eliminated their ability to simply push a 'drain' button on a server, and it took 3 years to rebuild workload migration capabilities.

   Design storage so that the durable state is a URL on object storage rather than bits on a specific physical disk. This makes migration as simple as pointing a new VM at the same storage URL.

3. Running all orchestration and management code on the host outside the VM, which means any platform change requires restarting host components and risks metastable failure across the entire fleet. Even benign-looking changes become time-consuming to validate because the blast radius is so large.

   Moving orchestration inside the VM means changes only affect new VMs that pick up the update, dramatically reducing risk and speeding up platform development iteration.

4. Using container images for ephemeral or disposable compute where creation speed matters. OCI container images have poor regional locality (creating on one server in a region doesn't cache layers for another nearby server), and pulling and unpacking large containers dominates creation time.

   For use cases that don't need custom container images, standardizing on a single base image and pre-pooling instances eliminates this bottleneck entirely.

Related Concepts

OCI Container Images

KVM Micro-VMs

Object Storage Architecture

JuiceFS Distributed Filesystem

dm-cache Kernel Caching

Gossip-based Service Discovery

Workload Migration

Checkpoint and Restore

Linux Namespaces and Containers

Litestream SQLite Replication

Metastable Failure in Distributed Systems

Read-through Caching Patterns

Immutable Data Chunk Storage