Load Testing with Koi Pond

Complex systems are difficult to reason about at scale; we often can’t accurately extrapolate system behavior and performance, so we need to derive that data empirically. We use load testing to do just that: find the limits of our systems and weed out bugs at a large scale in a controlled environment. Slack is a…

Shreya Ramesh
16 min read · advanced

Overview

The article discusses Slack's approach to load testing using a tool called Koi Pond, which simulates user interactions to assess system performance under heavy loads. It highlights the complexities of testing a system like Slack and the evolution of their load testing strategies to ensure reliability and scalability.

What You'll Learn

1. How to effectively simulate user behavior for load testing using Koi Pond
2. Why it's crucial to model complex user interactions in load testing
3. When to use formations to test thundering herd scenarios

Prerequisites & Requirements

  • Understanding of API interactions and real-time services
  • Familiarity with load testing tools and methodologies (optional)

Key Questions Answered

How does Koi Pond simulate user interactions for load testing?
Koi Pond simulates user interactions by spinning up slimmed-down versions of Slack clients, called koi, which establish websocket connections and send API calls. This allows for realistic modeling of user behavior and testing of both backend and real-time services under load.
What are the benefits of using Koi Pond over previous load testing tools?
Koi Pond is far cheaper and scales further than the previous tool, Puppet Show: simulating 2 million users with Koi Pond costs 0.26% of what simulating 150,000 users costs with Puppet Show. It also provides a realistic testing environment by mimicking actual user behavior and interactions.
What challenges does Slack face in load testing?
Slack faces challenges such as accurately modeling complex user interactions, ensuring the load testing environment reflects real-world usage, and scaling up gradually to avoid overwhelming system components. These challenges necessitate careful planning and execution of load tests.
When should formations be used in load testing?
Formations should be used when testing scenarios that involve multiple users performing the same action simultaneously, such as reacting to a message in a channel. This helps to assess the impact on real-time services and backend systems under realistic load conditions.
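
A formation, as described above, has many koi perform the same action at nearly the same moment. The following is a minimal sketch of that idea, not Koi Pond's actual configuration format or Slack's API: `run_formation`, `react`, and `simulate_formation` are hypothetical names, and the API call is replaced by an in-memory log so the example runs on its own.

```python
import asyncio
import random


async def react(koi_id: int, message_ts: str, log: list) -> None:
    """Stand-in for one koi adding a reaction; a real koi would call an API."""
    # Small jitter so the calls land within a narrow window, not one instant.
    await asyncio.sleep(random.uniform(0, 0.01))
    log.append((koi_id, message_ts))


async def run_formation(koi_ids, action, *args) -> None:
    """Fire the same action for every koi in the formation concurrently."""
    await asyncio.gather(*(action(koi_id, *args) for koi_id in koi_ids))


def simulate_formation(num_koi: int, message_ts: str) -> list:
    """Run a formation of num_koi koi that all react to the same message."""
    log: list = []
    asyncio.run(run_formation(range(num_koi), react, message_ts, log))
    return log
```

Calling `simulate_formation(50, "1700000000.000100")` returns one log entry per koi, all pointing at the same message, which is the thundering-herd shape a formation is meant to produce.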

Key Statistics & Figures

Cost efficiency of Koi Pond
0.26%
Running 2 million users with Koi Pond costs only 0.26% of running 150,000 users with Puppet Show.
Maximum simulated users tested
2 million
Koi Pond successfully simulated up to 2 million users in a single workspace without major issues.
Initial number of koi simulated
5,000
Testing began with 5,000 koi and scaled up to 500,000 leading up to a customer launch.
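
The 0.26% figure compares total costs at two different scales, so the per-user gap is larger still. A quick check of the arithmetic, using only the numbers quoted above and treating average cost per user as total cost divided by simulated users (the function name is illustrative):

```python
KOI_POND_USERS = 2_000_000
PUPPET_SHOW_USERS = 150_000
# Koi Pond total cost divided by Puppet Show total cost (0.26%).
TOTAL_COST_RATIO = 0.0026


def per_user_cost_ratio(total_ratio: float, koi_users: int, puppet_users: int) -> float:
    """Ratio of average per-user cost: Koi Pond relative to Puppet Show."""
    return total_ratio * puppet_users / koi_users


ratio = per_user_cost_ratio(TOTAL_COST_RATIO, KOI_POND_USERS, PUPPET_SHOW_USERS)
times_cheaper = 1 / ratio
```

Here `ratio` comes out to about 0.000195, so a simulated Koi Pond user costs roughly 0.02% of a simulated Puppet Show user at these scales, on the order of 5,000 times cheaper.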

Technologies & Tools

Load Testing Tool
Koi Pond
Used to simulate user interactions and assess system performance under load.
Communication Protocol
Websocket
Facilitates real-time communication between clients and the server during load tests.
Backend Service
API
Endpoints called by koi to simulate user actions during load testing.
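
The three pieces above fit together roughly as follows: each koi holds a connection for real-time events and periodically calls API endpoints. The sketch below is an illustration only. The `Koi` class, the action names, and the idea of picking actions by relative weight are assumptions, not Koi Pond's actual implementation, and the network is replaced with in-memory stubs.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Koi:
    """A slimmed-down simulated client (hypothetical structure)."""
    koi_id: int
    # Hypothetical action names mapped to relative weights.
    action_weights: dict
    rng: random.Random = field(default_factory=random.Random)
    connected: bool = False
    calls: list = field(default_factory=list)

    def connect(self) -> None:
        # A real koi would open a websocket here; this stub only flips a flag.
        self.connected = True

    def tick(self) -> str:
        """Pick one action by weight and record the simulated API call."""
        if not self.connected:
            raise RuntimeError("koi must connect before acting")
        actions = list(self.action_weights)
        weights = list(self.action_weights.values())
        action = self.rng.choices(actions, weights=weights, k=1)[0]
        self.calls.append(action)
        return action


# Usage: one koi that mostly idles and occasionally reads or posts.
koi = Koi(1, {"post_message": 1, "mark_read": 5, "idle": 20}, rng=random.Random(0))
koi.connect()
for _ in range(100):
    koi.tick()
```

Weighting the actions is one simple way to approximate a mix of user behavior; the real tool may model behavior quite differently.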

Key Actionable Insights

1. Implement Koi Pond for load testing to simulate realistic user behavior and interactions.
Using Koi Pond allows for comprehensive testing of both backend services and real-time interactions, ensuring that Slack can handle high loads effectively.
2. Gradually scale the number of simulated users during load tests to identify potential bottlenecks.
This approach helps in understanding the system's limits and prevents overwhelming any single component, which can lead to failures.
3. Combine manual QA testing with automated load testing for critical features.
This hybrid approach ensures that nuanced user interactions are accurately tested, which is crucial for complex systems like Slack.
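
The gradual scaling in the second insight can be sketched as a ramp schedule with a health check between steps. This is a minimal illustration under stated assumptions: `ramp_schedule`, `run_ramp`, the doubling factor, and the `is_healthy` callback are all hypothetical, and a real check would read dashboards or alerts rather than a simple threshold.

```python
from typing import Callable, Iterable, Iterator


def ramp_schedule(start: int, target: int, factor: float = 2.0) -> Iterator[int]:
    """Yield koi counts from start up to target, growing by `factor` per step."""
    if start <= 0 or target < start or factor <= 1.0:
        raise ValueError("need 0 < start <= target and factor > 1")
    count = start
    while count < target:
        yield count
        # Always advance by at least one koi so the loop terminates.
        count = min(target, max(count + 1, int(count * factor)))
    yield target


def run_ramp(schedule: Iterable[int], is_healthy: Callable[[int], bool]) -> int:
    """Step through the schedule; return the last koi count that stayed healthy."""
    last_healthy = 0
    for count in schedule:
        if not is_healthy(count):
            break
        last_healthy = count
    return last_healthy


steps = list(ramp_schedule(5_000, 500_000))
```

With the starting and final koi counts quoted earlier, `steps` doubles from 5,000 through 320,000 and then finishes at 500,000, so each increase can be observed before the next one begins.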

Common Pitfalls

1. Scaling up too quickly can strain system resources and lead to failures.
When preparing for a customer launch, going from 5,000 to 100,000 koi caused strain on a database. Gradual scaling allows for better monitoring of system performance and identification of issues.
2. Failing to accurately replicate real-world usage patterns can lead to misleading test results.
Using a single SSO token for high-rate calls resulted in no websocket events. Testing with multiple tokens provided a more accurate representation of load on the system.
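
One way to avoid the single-token pitfall is to spread simulated clients across a pool of tokens. The sketch below assumes a hypothetical round-robin assignment; the helper names and placeholder token strings are illustrative and say nothing about how Koi Pond or Slack actually manages tokens.

```python
from collections import Counter
from itertools import cycle


def assign_tokens(koi_ids, tokens):
    """Map each koi to a token round-robin so load spreads across tokens."""
    if not tokens:
        raise ValueError("at least one token is required")
    return dict(zip(koi_ids, cycle(tokens)))


def calls_per_token(assignment, calls_per_koi):
    """Count how many calls each token would carry in one test interval."""
    load = Counter()
    for token in assignment.values():
        load[token] += calls_per_koi
    return load


# Placeholder token names, not real credentials.
assignment = assign_tokens(range(6), ["token-a", "token-b", "token-c"])
load = calls_per_token(assignment, calls_per_koi=10)
```

With six koi and three tokens, each token carries the calls of two koi instead of one token carrying all of them.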

Related Concepts

Load Testing Methodologies
Real-time Communication Protocols
API Design and Interaction Patterns