BuildRock: A Build Platform at Slack

Our build platform is an essential piece of delivering code to production efficiently and safely at Slack. Over time it has undergone a lot of changes, and in 2021 the Build team started looking at the long-term vision. Some questions the Build team wanted to answer were: When should we invest in modernizing our build…

Joel Bartlett

Overview

The article discusses BuildRock, Slack's new build platform designed to enhance the efficiency and safety of code deployment. It outlines the challenges faced with the previous Jenkins-based system and details the innovative solutions implemented to modernize the build infrastructure.

What You'll Learn

1. How to modernize a build platform while maintaining existing workflows
2. Why implementing a stateless CI service improves deployment efficiency
3. When to utilize ephemeral agents in a CI/CD pipeline
4. How to integrate security practices into the CI/CD pipeline effectively

Prerequisites & Requirements

  • Understanding of CI/CD principles and Jenkins
  • Familiarity with Kubernetes and AWS (optional)

Key Questions Answered

What challenges did Slack face with its previous Jenkins-based build system?
Slack's previous Jenkins-based build system faced several challenges: diverse Jenkins clusters that were difficult to manage, high technical debt, and poor resource management. These problems reduced developer productivity, caused incidents, and led to suboptimal resource utilization, which in turn slowed code deployment.
How does BuildRock improve the build process at Slack?
BuildRock improves the build process by providing a stateless, immutable CI service that allows quick and safe deployments. It introduces both static and ephemeral Jenkins agents, integrates security practices into the deployment pipeline, and takes a shift-left approach to catch issues early in the development cycle.
What are the key features of the new build platform at Slack?
Key features of BuildRock include stateless CI services, the use of ephemeral and static agents, integrated security practices, and a focus on standardization and abstraction. These features collectively improve deployment efficiency and reduce the risks associated with build processes.
What impact did the new build platform have on Slack's business?
The new build platform significantly reduced time to market for individual services, improved the speed of addressing security vulnerabilities, and standardized the Jenkins inventory. This led to faster deployments and a more efficient build process, ultimately enhancing overall productivity.

Key Actionable Insights

1. Implementing a stateless CI service can drastically improve deployment efficiency.
   By separating business logic from build infrastructure, teams can deploy changes more quickly and with fewer errors. This approach is particularly useful for organizations looking to streamline their CI/CD processes.
2. Utilizing ephemeral agents can optimize resource usage in CI/CD pipelines.
   Ephemeral agents only run during build jobs and are terminated afterward, which can lead to significant cost savings and resource efficiency, especially in cloud environments.
3. Integrating security practices directly into the CI/CD pipeline is crucial.
   By incorporating security checks during the build process, organizations can catch vulnerabilities early, reducing the risk of security incidents in production.
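
The ephemeral-agent lifecycle described in the second insight can be sketched as follows. This is a minimal illustration in Python, not Slack's implementation: `AgentPool`, `ephemeral_agent`, and `run_job` are hypothetical names, and the pool is an in-memory stand-in for whatever provisioner (for example a Kubernetes or cloud API) would launch real agents. The point it shows is that an agent exists only for the duration of one job and is terminated even when the job fails.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class AgentPool:
    """Hypothetical in-memory stand-in for a real agent provisioner."""

    active: List[str] = field(default_factory=list)
    launched: int = 0

    def provision(self) -> str:
        # Create a fresh agent with a unique id and track it as active.
        self.launched += 1
        agent_id = f"agent-{self.launched}"
        self.active.append(agent_id)
        return agent_id

    def terminate(self, agent_id: str) -> None:
        # Remove the agent so it no longer consumes resources.
        self.active.remove(agent_id)


@contextmanager
def ephemeral_agent(pool: AgentPool) -> Iterator[str]:
    """Provision an agent for one job and always terminate it afterward."""
    agent_id = pool.provision()
    try:
        yield agent_id
    finally:
        # Runs on success and on failure, so no agent outlives its job.
        pool.terminate(agent_id)


def run_job(pool: AgentPool, job: Callable[[str], str]) -> str:
    """Run a single build job on its own short-lived agent."""
    with ephemeral_agent(pool) as agent_id:
        return job(agent_id)
```

Calling `run_job(pool, some_build_step)` twice launches two separate agents and leaves `pool.active` empty after each call. A static agent, by contrast, would stay in `pool.active` between jobs.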

Common Pitfalls

1. Failing to standardize Jenkins configurations can lead to 'snowflake' clusters that are difficult to manage.
   Without standardization, each Jenkins instance may have unique configurations and dependencies, complicating upgrades and maintenance. To avoid this, organizations should implement a centralized configuration management strategy.
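
One way to make 'snowflake' clusters visible is to compare each instance's settings against a single shared baseline. The sketch below assumes each configuration has already been exported as a flat key/value mapping. The function name `find_drift`, the instance names, and all keys and values in the usage note are hypothetical and chosen only for illustration; a real setup would read them from its own configuration management tool.

```python
from typing import Dict, List

Config = Dict[str, str]


def find_drift(baseline: Config, instances: Dict[str, Config]) -> Dict[str, List[str]]:
    """Return, per instance, the sorted keys whose values differ from the baseline.

    A key counts as drifted if its value differs, if it is missing from the
    instance, or if the instance defines it and the baseline does not.
    Instances that match the baseline exactly are omitted from the result.
    """
    drift: Dict[str, List[str]] = {}
    for name, config in instances.items():
        all_keys = set(baseline) | set(config)
        changed = sorted(k for k in all_keys if baseline.get(k) != config.get(k))
        if changed:
            drift[name] = changed
    return drift
```

For example, with a baseline of `{"java": "17", "executors": "4"}`, an instance configured as `{"java": "11", "executors": "4", "custom_plugin": "yes"}` would be reported with the drifted keys `custom_plugin` and `java`, while an instance identical to the baseline would not appear in the result.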