Slashing CI Wait Times: How Pinterest Cut Android Testing Build Times by 36%+

Pinterest Engineering


15 min read • advanced


Firebase, MVP, Python, YAML

Overview

Pinterest reduced Android testing build times by more than 36% by implementing a runtime-aware sharding mechanism. By using historical test data and optimizing its in-house testing platform, the team improved the efficiency and reliability of its continuous integration (CI) process.

What You'll Learn

1. How to implement runtime-aware sharding for Android testing
2. Why historical test data is crucial for optimizing CI processes
3. How to reduce CI wait times through effective resource allocation

Prerequisites & Requirements

Understanding of continuous integration and testing frameworks
Familiarity with AWS EC2 and emulator technologies (optional)

Key Questions Answered

How did Pinterest reduce Android testing build times by 36%?

Pinterest implemented a runtime-aware sharding mechanism that uses historical test duration and stability data to balance test execution times across shards. This approach reduced the end-to-end build time by 9 minutes and improved the reliability of the CI process.

What challenges did Pinterest face with Firebase Test Lab?

Pinterest encountered significant setup overhead and instability with Firebase Test Lab, where each test run incurred a baseline setup time of five to six minutes. This overhead became a major portion of the total build duration, contributing to flakiness and reduced developer productivity.

What is runtime-aware sharding and how does it work?

Runtime-aware sharding is a method that assigns tests to shards based on historical runtime data, ensuring that all shards finish around the same time. This method reduces tail latency and improves overall CI feedback times.
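One common way to approximate this kind of balancing is a greedy "longest test first" heuristic: sort tests by historical duration and always place the next test on the shard with the smallest running total. The summary does not say which algorithm Pinterest actually uses, so the Python below is only a minimal sketch under that assumption. The function names, test names, and durations are all hypothetical.

```python
import heapq


def shard_by_runtime(durations, num_shards):
    """Assign tests to shards so that total runtimes are roughly equal.

    durations: dict mapping test name -> historical runtime in seconds.
    Returns a list of shards, each a list of test names.
    """
    # Min-heap of (accumulated_seconds, shard_index); the lightest shard is on top.
    heap = [(0.0, i) for i in range(num_shards)]
    heapq.heapify(heap)
    shards = [[] for _ in range(num_shards)]

    # Place the longest tests first; break ties by name so output is deterministic.
    ordered = sorted(durations.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, seconds in ordered:
        total, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (total + seconds, index))
    return shards


def shard_totals(shards, durations):
    """Sum the historical runtime of every shard."""
    return [sum(durations[test] for test in shard) for shard in shards]


# Hypothetical historical runtimes in seconds (not Pinterest data).
history = {
    "test_login": 120.0,
    "test_feed": 90.0,
    "test_search": 60.0,
    "test_pin_create": 45.0,
    "test_profile": 30.0,
    "test_settings": 15.0,
}
shards = shard_by_runtime(history, 3)
print(shard_totals(shards, history))  # [120.0, 120.0, 120.0]
```

In this toy input the three shards come out exactly even. Real suites rarely divide so cleanly, but the heuristic still tends to keep the slowest shard close to the average.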

What were the results of implementing runtime-aware sharding?

Runtime-aware sharding reduced the slowest shard's runtime by 55% and narrowed the gap between the fastest and slowest shards from 597 seconds to 130 seconds, giving developers faster CI feedback.

Key Statistics & Figures

Reduction in end-to-end build time

9 minutes

This corresponds to a reduction of more than 36% in end-to-end build time.

Decrease in slowest shard's runtime

55%

Because a sharded build finishes only when its slowest shard does, this reduction directly shortens the overall CI run.

Time difference between fastest and slowest shards

from 597 seconds to 130 seconds

The narrower gap shows that test execution is more evenly balanced across shards.

Technologies & Tools

Cloud Infrastructure

AWS EC2

Used to host emulators for the in-house testing platform.

Test Management System

Metro

Tracks test results and historical data for optimizing test allocation.

Key Actionable Insights

1. Implementing a runtime-aware sharding mechanism can drastically improve CI efficiency.
   By analyzing historical test data, teams can better allocate tests to minimize wait times and enhance reliability, ultimately leading to faster feedback loops.

2. Transitioning from third-party testing platforms to in-house solutions can provide greater control over testing environments.
   This allows teams to customize their infrastructure to meet specific needs, reducing external dependencies and improving overall performance.

3. Regularly monitor and analyze test performance metrics to identify bottlenecks.
   Continuous evaluation of test runtimes can help teams adjust their sharding strategies and improve the stability of their CI pipelines.
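The monitoring described in the third insight can start from a simple per-test summary of historical runs. The sketch below assumes run records are available as (test name, duration in seconds, passed) tuples. That record shape, the function names, and the sample data are hypothetical and do not reflect how Metro stores results.

```python
from collections import defaultdict
from statistics import mean


def summarize(runs):
    """Compute mean duration and failure rate per test from historical runs."""
    by_test = defaultdict(list)
    for name, seconds, passed in runs:
        by_test[name].append((seconds, passed))

    summary = {}
    for name, records in by_test.items():
        failures = sum(1 for _, ok in records if not ok)
        summary[name] = {
            "mean_seconds": mean(seconds for seconds, _ in records),
            "failure_rate": failures / len(records),
        }
    return summary


def slowest(summary, n):
    """Return the names of the n tests with the highest mean duration."""
    ranked = sorted(summary, key=lambda t: summary[t]["mean_seconds"], reverse=True)
    return ranked[:n]


# Hypothetical run history (not Pinterest data).
runs = [
    ("test_feed", 80.0, True),
    ("test_feed", 100.0, False),
    ("test_login", 30.0, True),
    ("test_login", 34.0, True),
    ("test_search", 60.0, True),
]
report = summarize(runs)
print(slowest(report, 2))                  # ['test_feed', 'test_search']
print(report["test_feed"]["failure_rate"])  # 0.5
```

A summary like this can feed a sharding step with up-to-date duration estimates and flag slow or unstable tests for follow-up.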

Common Pitfalls

1. Relying solely on equal test counts per shard can lead to uneven runtimes.

This approach may cause some shards to finish significantly later than others, which can delay the entire CI process. It's essential to consider historical performance data when allocating tests to ensure balanced execution times.
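The pitfall is easy to reproduce with a few numbers. The sketch below splits four tests evenly by count and measures the gap between the slowest and fastest shard. The test names and durations are hypothetical, chosen only to make the imbalance visible.

```python
def split_by_count(test_names, num_shards):
    """Round-robin split: every shard gets the same number of tests (within one)."""
    return [test_names[i::num_shards] for i in range(num_shards)]


def shard_spread(shards, durations):
    """Return per-shard total runtimes and the slowest-minus-fastest gap."""
    totals = [sum(durations[test] for test in shard) for shard in shards]
    return totals, max(totals) - min(totals)


# Hypothetical historical runtimes in seconds (not Pinterest data).
durations = {
    "test_video_upload": 300.0,
    "test_open_settings": 20.0,
    "test_board_sync": 280.0,
    "test_tap_like": 10.0,
}
count_shards = split_by_count(list(durations), 2)
totals, spread = shard_spread(count_shards, durations)
print(totals)  # [580.0, 30.0]
print(spread)  # 550.0
```

Both shards hold two tests, yet one runs for 580 seconds and the other for 30. Pairing each long test with a short one instead would give totals of 310 and 300 seconds, which is the kind of balance that duration-based allocation aims for.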

Related Concepts

Continuous Integration

Test Automation

Performance Optimization