Improving performance and capacity for Espresso with new Netty framework

Will Hu

13 min read, advanced

Java, WebSocket

Overview

This article describes the migration of Espresso, LinkedIn's distributed data store, to a new Netty 4-based framework and the performance and capacity gains that followed. It covers Espresso's architecture, the benefits of the Netty framework, and the measured effect of the migration on latency and throughput.

What You'll Learn

1. How to modernize a distributed data store using the Netty framework

2. Why implementing a new thread model can enhance performance in multi-threaded applications

3. How to measure performance improvements in a large-scale system migration

Key Questions Answered

What improvements were achieved by migrating Espresso to the Netty framework?

The migration to the Netty framework resulted in significant performance improvements, including a 100x reduction in OldGen GC and a 60% reduction in latency for write-heavy clusters. Read-heavy clusters also experienced latency improvements due to the new thread model and native epoll support.

What are the major features implemented in the new Netty framework for Espresso?

The new Netty framework for Espresso includes a new thread model for I/O operations, better memory management with direct buffer pooling, streamlined asynchronous handling, and native epoll support for socket connections, all aimed at enhancing performance and capacity.

How does the new thread model improve Espresso's performance?

The new thread model avoids inter-thread locking by using thread-local variables, which enhances CPU cache hit rates and reduces contention between I/O threads, leading to improved performance in data processing.
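
The idea of replacing shared, locked state with thread-local state can be sketched in plain Java. This is a minimal illustration, not Espresso's actual code: the class `ThreadLocalStatsSketch`, its `LocalStats` holder, and the `runWorkers` helper are all hypothetical names chosen for the example. Each worker thread increments its own counter without taking a lock on the hot path; a lock is taken only once per thread, when that thread's counter is first registered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: per-thread counters instead of one shared, locked counter.
final class ThreadLocalStatsSketch {
    /** Per-thread state; only the owning thread ever writes to an instance. */
    static final class LocalStats {
        long requests;
    }

    private final List<LocalStats> registered = new ArrayList<>();
    private final ThreadLocal<LocalStats> local = ThreadLocal.withInitial(this::register);

    // Runs once per thread, on that thread's first access; the only locked step.
    private LocalStats register() {
        LocalStats stats = new LocalStats();
        synchronized (registered) {
            registered.add(stats);
        }
        return stats;
    }

    /** Hot path: no lock and no shared cache line contended between threads. */
    void recordRequest() {
        local.get().requests++;
    }

    /** Sums all per-thread counters; call only after the workers have finished. */
    long total() {
        long sum = 0;
        synchronized (registered) {
            for (LocalStats stats : registered) {
                sum += stats.requests;
            }
        }
        return sum;
    }

    /** Starts the given number of threads, each recording requestsPerThread requests. */
    static long runWorkers(int threads, int requestsPerThread) {
        ThreadLocalStatsSketch stats = new ThreadLocalStatsSketch();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread worker = new Thread(() -> {
                for (int j = 0; j < requestsPerThread; j++) {
                    stats.recordRequest();
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // join makes each worker's writes visible to this thread
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return stats.total();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 100_000));
    }
}
```

The trade-off is that reads of the aggregate become more expensive and only approximately current while workers are running, which is usually acceptable for statistics but not for state that must be consistent across threads.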

What metrics were used to measure the performance of Espresso after migration?

Metrics used for performance measurement included JVM GC times, latency (P99 and max latencies), and capacity measured in Read Capacity Units (RCU) and Write Capacity Units (WCU), providing a comprehensive view of system performance pre- and post-migration.
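
As a rough illustration of the latency metrics mentioned above, the sketch below computes a P99 and a maximum from a set of latency samples and expresses a before/after change as a percentage. The class and method names are hypothetical, the nearest-rank percentile definition is an assumption (the article summary does not say which definition was used), and the sample values in `main` are made up for demonstration only.

```java
import java.util.Arrays;

// Hypothetical helper for comparing latency before and after a migration.
final class LatencyStatsSketch {
    /** Nearest-rank percentile: smallest sample with at least pct% of samples at or below it. */
    static long percentile(long[] samples, double pct) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Largest observed sample. */
    static long max(long[] samples) {
        return Arrays.stream(samples).max().getAsLong();
    }

    /** Relative reduction from 'before' to 'after', in percent. */
    static double reductionPercent(double before, double after) {
        return (before - after) * 100.0 / before;
    }

    public static void main(String[] args) {
        // Made-up latencies in milliseconds, purely for demonstration.
        long[] before = {12, 15, 11, 40, 13, 14, 90, 12, 13, 16};
        long[] after = {6, 7, 5, 18, 6, 7, 30, 6, 6, 8};
        long p99Before = percentile(before, 99);
        long p99After = percentile(after, 99);
        System.out.println("P99 before: " + p99Before + " ms, after: " + p99After + " ms");
        System.out.println("Max before: " + max(before) + " ms, after: " + max(after) + " ms");
        System.out.println("P99 reduction: " + reductionPercent(p99Before, p99After) + " %");
    }
}
```

With few samples, as here, P99 and the maximum coincide; in production the two are normally computed over a large window of requests so that they diverge and a single outlier does not dominate P99.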

Key Statistics & Figures

OldGen GC reduction: 100x. Observed in write-heavy clusters after migration to the Netty framework.

YoungGen GC reduction: 10x. Noted in write-heavy clusters post-migration.

Latency reduction for write-heavy clusters: 60%. Measured after the migration to the new Netty framework.

Latency reduction for read-heavy clusters: 30%. Observed in P99 and max latency after migration.

RCU improvement: 100%. Achieved across small to large data sizes for read operations.

WCU improvement: 60-100%. Observed on different data sizes for write operations.

Technologies & Tools

Backend

Netty

Used as the new framework for Espresso to enhance performance and capacity.

Key Actionable Insights

1. Implement a new thread model in your applications to enhance performance and reduce contention.
By using thread-local variables and avoiding inter-thread locking, you can significantly improve CPU cache hit rates, which is crucial for high-performance applications.

2. Adopt direct buffer pooling for memory management to reduce garbage collection pressure.
By managing memory allocation directly from the operating system instead of the JVM heap, you can improve application performance and reduce latency, especially in memory-intensive operations.

3. Use metrics effectively to measure the impact of system migrations.
Establishing clear performance metrics before and after migration allows for better assessment of improvements and guides future optimizations.
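
The direct buffer pooling in insight 2 can be illustrated with a small JDK-only sketch. Netty ships its own pooled allocator (`PooledByteBufAllocator`), which is far more sophisticated; the class below is a hypothetical stand-in that only shows the principle: buffers are allocated off the JVM heap once and then reused, so steady-state traffic creates little new garbage for the collector.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical, simplified pool of fixed-size direct (off-heap) buffers.
final class DirectBufferPoolSketch {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int bufferSize;
    private int allocated; // how many buffers were ever created by this pool

    DirectBufferPoolSketch(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    /** Reuses a pooled buffer when one is free, otherwise allocates a new direct buffer. */
    synchronized ByteBuffer acquire() {
        ByteBuffer buffer = free.pollFirst();
        if (buffer == null) {
            buffer = ByteBuffer.allocateDirect(bufferSize);
            allocated++;
        }
        return buffer;
    }

    /** Returns a buffer to the pool; the caller must not use it afterwards. */
    synchronized void release(ByteBuffer buffer) {
        buffer.clear(); // resets position and limit; does not zero the contents
        free.addFirst(buffer);
    }

    synchronized int allocatedCount() {
        return allocated;
    }

    public static void main(String[] args) {
        DirectBufferPoolSketch pool = new DirectBufferPoolSketch(4096);
        ByteBuffer first = pool.acquire();
        pool.release(first);
        ByteBuffer second = pool.acquire(); // same underlying buffer, no new allocation
        System.out.println(first == second);
        System.out.println(pool.allocatedCount());
    }
}
```

A buffer that is acquired and never released is lost to the pool, which is why pooled memory makes leak monitoring more important, as the first pitfall below points out.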

Common Pitfalls

1. Failing to monitor memory usage can lead to undetected memory leaks.

In high-performance applications, neglecting to track memory allocation and deallocation can result in significant performance degradation and system instability.

2. Not implementing proper testing for stress and performance can lead to deployment failures.

Without adequate testing tools and methodologies, the risks of deploying new systems increase, potentially leading to service outages.

Related Concepts

Distributed Systems

Performance Optimization Techniques

Memory Management Strategies
