Pre-Submit Integration Tests For Ads-Serving

Pinterest Engineering

Overview

The article discusses the implementation of pre-submit integration tests for the ads-serving platform at Pinterest, aimed at improving code quality and reducing deployment issues. It highlights the challenges faced with scaling and the need for developers to be accountable for their code, leading to the development of a testing framework that captures nuanced problems before code merges.

What You'll Learn

1. How to implement a pre-submit testing framework for code changes

2. Why shifting debugging responsibility to code authors improves deployment stability

3. How to capture nuanced performance metrics using load testing

Prerequisites & Requirements

Understanding of integration testing concepts
Familiarity with the Go programming language and its expvar package

Key Questions Answered

How does the pre-submit testing framework improve code quality?

The pre-submit testing framework lets developers test their code changes against production-like traffic and metrics before merging. This proactive approach catches issues early and shifts debugging responsibility from on-call engineers to the code authors. Rollbacks fell by 30% from Q4 2018 to Q1 2019 after the framework was deployed.

What metrics are monitored by the pre-submit test?

The pre-submit test monitors around 60 key service metrics, including overall success rate, latency, ad insertions, logging volume, expensive RPC calls, goroutine leaks, error logs, and process crashes. This comprehensive monitoring helps catch nuanced bugs that traditional unit tests might miss.

What are the main requirements for the test environment?

The test environment must not serve live Pinner-facing traffic, must not interact with systems that serve live traffic, and must not pollute production metrics. Solutions include preventing test hosts from registering with the production serverset and disabling the background threads that publish metrics to OpenTSDB.
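The two isolation measures above can be pictured as startup guards. The following is a minimal sketch, not Pinterest's implementation: the `Config` type, its `TestMode` field, and the step names are hypothetical and exist only to show production-only steps being skipped on a test host.

```go
package main

import "fmt"

// Config is a hypothetical service configuration. The TestMode field is
// an assumption made for this sketch, not a name from the article.
type Config struct {
	TestMode bool
}

// startupSteps returns the side-effecting steps a host would run at startup.
// In test mode the host still serves requests, but it skips the two steps
// that would expose it to live traffic or pollute production metrics.
func startupSteps(cfg Config) []string {
	steps := []string{"start RPC server"}
	if cfg.TestMode {
		return steps
	}
	return append(steps,
		"register with production serverset",
		"start background metrics publisher",
	)
}

func main() {
	fmt.Println("test host:", startupSteps(Config{TestMode: true}))
	fmt.Println("prod host:", startupSteps(Config{TestMode: false}))
}
```

Keeping the guard in one place makes it easy to audit that a test host cannot reach either production-facing step.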

How does the framework ensure fair load testing?

The framework logs a sample of all production requests to a Kafka topic and uses this data to send identical requests to both the test and golden hosts. This ensures that the load testing accurately reflects production conditions, allowing for unbiased metric comparisons.
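A rough sketch of the replay idea follows. It makes several assumptions: plain HTTP stands in for whatever RPC protocol the service actually uses, an in-memory slice stands in for the sampled requests read from Kafka, and the `recorder`, `replay`, and `runDemo` names are invented for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// recorder is a stand-in host that remembers every request body it receives.
type recorder struct {
	mu     sync.Mutex
	bodies []string
}

func (r *recorder) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	b, _ := io.ReadAll(req.Body)
	r.mu.Lock()
	r.bodies = append(r.bodies, string(b))
	r.mu.Unlock()
	w.WriteHeader(http.StatusOK)
}

// replay sends every logged request, in order, to each host URL so that
// all hosts see an identical workload.
func replay(client *http.Client, hosts []string, logged [][]byte) error {
	for _, body := range logged {
		for _, h := range hosts {
			resp, err := client.Post(h, "application/octet-stream", bytes.NewReader(body))
			if err != nil {
				return fmt.Errorf("replay to %s: %w", h, err)
			}
			io.Copy(io.Discard, resp.Body)
			resp.Body.Close()
		}
	}
	return nil
}

// runDemo replays two sample requests against a fake test host and a fake
// golden host, then returns what each host received.
func runDemo() (testSeen, goldenSeen []string, err error) {
	testHost, goldenHost := &recorder{}, &recorder{}
	ts, gs := httptest.NewServer(testHost), httptest.NewServer(goldenHost)
	defer ts.Close()
	defer gs.Close()

	// Stand-in for requests sampled from production and read back from a log.
	logged := [][]byte{[]byte("request-1"), []byte("request-2")}
	err = replay(http.DefaultClient, []string{ts.URL, gs.URL}, logged)
	return testHost.bodies, goldenHost.bodies, err
}

func main() {
	testSeen, goldenSeen, err := runDemo()
	fmt.Println(testSeen, goldenSeen, err)
}
```

Because both hosts receive the same requests in the same order, any difference in their metrics can be attributed to the code change rather than to the traffic mix.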

Key Statistics & Figures

Reduction in rollbacks

30%

This statistic reflects the decrease in rollbacks from Q4 2018 to Q1 2019 after deploying the pre-submit testing framework.

Number of key service metrics monitored

60

The pre-submit test monitors around 60 key service metrics to capture nuanced bugs.

Technologies & Tools

Programming Language

Go

Used for implementing the ads-serving platform and the metrics computation.

Message Broker

Kafka

Used for logging production requests to facilitate load testing.

Metrics Library

expvar

Used for computing metrics and exposing them in JSON format.
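As an illustration of how Go's standard `expvar` package exposes metrics as JSON, here is a small sketch. The metric names (`requests_total`, `ad_insertions_total`, `goroutines`) and the `snapshot` helper are assumptions chosen to echo the kinds of metrics the article lists; they are not the names the ads-serving platform uses.

```go
package main

import (
	"encoding/json"
	"expvar"
	"fmt"
	"runtime"
)

// Hypothetical counters, published under illustrative names.
var (
	requests     = expvar.NewInt("requests_total")
	adInsertions = expvar.NewInt("ad_insertions_total")
)

func init() {
	// expvar.Func is evaluated on every read, so this always reports the
	// current goroutine count, which is useful for spotting leaks.
	expvar.Publish("goroutines", expvar.Func(func() interface{} {
		return runtime.NumGoroutine()
	}))
}

// snapshot collects every published variable as raw JSON, keyed by name.
// It mirrors what the /debug/vars HTTP handler would serve.
func snapshot() map[string]json.RawMessage {
	out := make(map[string]json.RawMessage)
	expvar.Do(func(kv expvar.KeyValue) {
		out[kv.Key] = json.RawMessage(kv.Value.String())
	})
	return out
}

func main() {
	requests.Add(10)
	adInsertions.Add(4)
	s := snapshot()
	fmt.Printf("requests_total=%s\n", s["requests_total"])
	fmt.Printf("ad_insertions_total=%s\n", s["ad_insertions_total"])
	fmt.Printf("goroutines=%s\n", s["goroutines"])
}
```

Importing `expvar` also registers a `/debug/vars` handler on the default HTTP mux, so a process that already runs an HTTP server on that mux serves the same JSON without extra code.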

Key Actionable Insights

1. Implement a pre-submit testing framework to catch bugs early in the development process.
This approach allows developers to identify issues related to their code changes before they reach production, significantly reducing the risk of rollbacks and improving overall code quality.

2. Encourage developers to take ownership of their code by shifting debugging responsibilities.
By making developers accountable for their changes, the workload on on-call engineers decreases, allowing them to focus on more critical production issues.

3. Utilize metrics comparison to identify performance regressions effectively.
By comparing metrics from test and golden hosts, teams can quickly pinpoint significant variations and address potential issues before they affect users.
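The comparison in the third insight can be sketched as a relative-difference check between two metric snapshots. This is an assumed approach: the article does not state which comparison rule or thresholds the framework applies, and the `regressions` and `relDiff` functions and the sample metric names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// relDiff returns |t-g| / |g|, treating a zero golden value specially so
// that any non-zero test value counts as an unbounded change.
func relDiff(t, g float64) float64 {
	if g == 0 {
		if t == 0 {
			return 0
		}
		return math.Inf(1)
	}
	return math.Abs(t-g) / math.Abs(g)
}

// regressions returns, in sorted order, the metrics whose relative
// difference between the test and golden hosts exceeds threshold.
// A metric missing from the test host is also flagged.
func regressions(test, golden map[string]float64, threshold float64) []string {
	var flagged []string
	for name, g := range golden {
		t, ok := test[name]
		if !ok || relDiff(t, g) > threshold {
			flagged = append(flagged, name)
		}
	}
	sort.Strings(flagged)
	return flagged
}

func main() {
	golden := map[string]float64{"success_rate": 0.999, "p99_latency_ms": 50, "goroutines": 1000}
	test := map[string]float64{"success_rate": 0.998, "p99_latency_ms": 65, "goroutines": 1010}
	// Flag anything that moved by more than 5% relative to the golden host.
	fmt.Println(regressions(test, golden, 0.05))
}
```

A single global threshold is the simplest choice; in practice, noisy metrics would likely need per-metric thresholds.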

Common Pitfalls

1. Failing to test code changes that involve new experiments can lead to undetected issues.

Developers may inadvertently bypass the pre-submit tests for experimental changes, resulting in potential bugs slipping through. It's crucial to ensure that all code changes, including experiments, are adequately tested before merging.