Benchmarking Apache Kafka: 2 Million Writes Per Second (On Three Cheap Machines)

Jay Kreps

18 min read • advanced

Apache, Apache Kafka, Cassandra, Google Compute Engine, Java, RabbitMQ

Overview

The article discusses benchmarking Apache Kafka's performance, achieving 2 million writes per second on a modest hardware setup. It highlights Kafka's architecture, producer and consumer throughput, and the impact of message size on performance.

What You'll Learn

1. How to benchmark Apache Kafka's write performance on inexpensive hardware

2. Why Kafka's architecture allows for high throughput and low latency

3. How to configure Kafka for optimal producer and consumer throughput

Prerequisites & Requirements

Understanding of distributed systems and messaging architectures
Familiarity with Apache Kafka and its configuration (optional)

Key Questions Answered

What is the maximum write throughput of Apache Kafka on low-cost hardware?

The article reports up to 2,024,032 records per second from three producers writing with 3x asynchronous replication to a cluster of three machines. This shows that Kafka can sustain high write rates even on inexpensive hardware.

How does message size affect Kafka's throughput?

Record throughput (records/second) decreases as message size increases, but byte throughput (MB/second) increases. Small messages are the harder case because the fixed per-record overhead dominates, while larger messages amortize that overhead and move more data per second.
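The relationship between the two throughput measures is simple arithmetic, sketched below. The function names are illustrative, and the conversion assumes 1 MB = 1024 * 1024 bytes; the 821,557 records/sec figure and the 100-byte record size come from the single-producer result reported in this summary.

```python
def mb_per_sec(records_per_sec, record_bytes):
    """Convert a record rate to MB/sec, assuming 1 MB = 1024 * 1024 bytes."""
    return records_per_sec * record_bytes / (1024 * 1024)


def records_per_sec_for(mb_s, record_bytes):
    """Record rate needed to sustain a given MB/sec at a given record size."""
    return mb_s * 1024 * 1024 / record_bytes


# Reported single-producer result: 821,557 records/sec of 100-byte records.
small_mb_s = mb_per_sec(821_557, 100)

# The same byte rate with 1,000-byte records needs only a tenth of the record rate.
same_bytes_at_1000 = records_per_sec_for(small_mb_s, 1000)

print(f"{small_mb_s:.2f} MB/sec at 100 bytes per record")
print(f"{same_bytes_at_1000:,.0f} records/sec would match that at 1,000 bytes per record")
```

At 100 bytes per record, the reported rate works out to roughly 78 MB/sec, which is why a falling record rate can still coincide with a rising byte rate as messages grow.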

What are the latency metrics for message delivery in Kafka?

The median end-to-end latency for message delivery in Kafka is 2 ms, with 3 ms at the 99th percentile and 14 ms at the 99.9th percentile. This indicates Kafka's capability for low-latency message processing.
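As a reminder of what these percentile figures mean, here is a minimal sketch of a nearest-rank percentile over a list of latency samples. The sample data is synthetic and was invented for illustration; it is not the article's measurement data, and the article may have computed its percentiles differently.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Synthetic latencies in milliseconds (hypothetical, for illustration only):
# most requests are fast, a few are slower, and a tiny tail is much slower.
latencies_ms = [2] * 900 + [3] * 98 + [14] * 2

summary = {p: percentile(latencies_ms, p) for p in (50, 99, 99.9)}
for p, value in summary.items():
    print(f"p{p}: {value} ms")
```

The gap between the median and the 99.9th percentile is the point of reporting several percentiles: a handful of slow deliveries barely moves the median but dominates the tail.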

Key Statistics & Figures

Single producer thread, no replication throughput

821,557 records/sec

This was achieved while producing 50 million small (100-byte) records.

Single producer thread, 3x asynchronous replication throughput

786,980 records/sec

This is about 4% lower than the unreplicated result (821,557 records/sec), so asynchronous replication adds little overhead on the producer side.

Three producers, 3x async replication throughput

2,024,032 records/sec

This demonstrates the aggregate capacity of the Kafka cluster with multiple producers.

Single Consumer throughput

940,521 records/sec

This was measured while consuming from a 6-partition, 3x-replicated topic.
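The figures above can be compared with a few lines of arithmetic. This is only a sketch over the four numbers listed in this section; the variable names are illustrative and the derived ratios are not stated in the summary itself.

```python
# Reported throughput figures, in records/sec.
single_no_replication = 821_557
single_async_3x = 786_980
three_producers_async_3x = 2_024_032
single_consumer = 940_521

# Fraction of throughput lost when 3x asynchronous replication is turned on.
replication_cost = 1 - single_async_3x / single_no_replication

# How much three producers deliver relative to one, under the same replication.
scaling_factor = three_producers_async_3x / single_async_3x

# Average rate per producer in the three-producer run.
per_producer = three_producers_async_3x / 3

print(f"replication cost: {replication_cost:.1%}")
print(f"scaling with three producers: {scaling_factor:.2f}x")
print(f"average per producer: {per_producer:,.0f} records/sec")
```

The scaling factor is a little under 3x, so three producers do not triple throughput, but the aggregate is still far above what any single producer reached.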

Technologies & Tools

Backend

Apache Kafka

Used as a distributed messaging system for high-throughput data streaming.

Key Actionable Insights

1
To achieve high throughput in Kafka, run multiple producers so that the cluster's full capacity is used.
With three producer processes, the article reaches over 2 million records per second, compared with 786,980 records per second from a single producer thread under the same 3x asynchronous replication. This matters for applications with high data ingestion rates.

2
Consider the impact of message size on performance when designing your Kafka architecture.
The article highlights that smaller messages can lead to higher overhead, suggesting that optimizing message size can improve overall throughput and efficiency in data processing.

3
Use asynchronous replication when write throughput matters more than the strongest durability guarantee.
With asynchronous replication, the leader acknowledges a write without waiting for followers to copy it, so producers get faster acknowledgments and higher write throughput. The trade-off is that an acknowledged write that has not yet been replicated can be lost if the leader fails.
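In the Kafka producer, this choice is made with the `acks` setting: `acks=1` returns once the leader has the write, while `acks=all` waits for the in-sync replicas. The sketch below only builds the configuration text and does not connect to a cluster. The helper names and the broker address are hypothetical, and this summary does not state which settings the benchmark used.

```python
def producer_props(bootstrap_servers, wait_for_replicas):
    """Build a minimal producer configuration.

    wait_for_replicas=False -> acks=1   (leader-only acknowledgment)
    wait_for_replicas=True  -> acks=all (wait for in-sync replicas)
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "acks": "all" if wait_for_replicas else "1",
    }


def to_properties(props):
    """Render a config dict as Java-style .properties text, sorted by key."""
    return "\n".join(f"{key}={value}" for key, value in sorted(props.items()))


# "broker1:9092" is a placeholder address, not one taken from the article.
async_config = to_properties(producer_props("broker1:9092", wait_for_replicas=False))
sync_config = to_properties(producer_props("broker1:9092", wait_for_replicas=True))

print(async_config)
print()
print(sync_config)
```

Keeping the two variants side by side makes it easy to run the same benchmark under both settings and measure the throughput difference on your own hardware.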

Common Pitfalls

1

Over-optimizing Kafka configurations for specific benchmarks can lead to misleading results.

The article emphasizes the importance of 'lazy benchmarking' to ensure that performance metrics reflect real-world usage rather than idealized scenarios that may not be applicable in multi-tenant environments.

Related Concepts

Distributed Systems

Messaging Architectures

Performance Benchmarking

Data Streaming