Scaling storage in the datacenter with RoCE

Ishan Shah
10 min read · advanced

Overview

The article discusses the implementation of Remote Direct Memory Access (RDMA) over Converged Ethernet version 2 (RoCEv2) in LinkedIn's data centers to enhance storage scalability and performance. It highlights the challenges of under-utilization of NVMe drives and how RDMA enables efficient storage management with low latency and high throughput.

What You'll Learn

1. How to implement RDMA over Converged Ethernet version 2 (RoCEv2) for scalable storage solutions
2. Why a lossless Ethernet network is essential for high-performance storage capabilities
3. When to leverage Priority Flow Control (PFC) for guaranteed lossless networking

Prerequisites & Requirements

  • Understanding of RDMA and RoCE technologies
  • Familiarity with SONiC and network configuration (optional)
  • Experience with data center networking and storage systems

Key Questions Answered

How does RDMA improve storage performance in data centers?
RDMA lets one host read and write another host's memory without involving the remote CPU, enabling high throughput and low latency. This helps disaggregate NVMe flash storage from compute resources, allowing each to scale independently and improving data protection, which is crucial in hyperscale environments.
What are the prerequisites for implementing RoCEv2 in a data center?
To implement RoCEv2, it is essential to configure a lossless Ethernet network, trust L3 DSCP markings, and ensure that only storage traffic uses RoCEv2. Additionally, lossless buffer configuration and support for Data Center Quantized Congestion Notification (DCQCN) are necessary.
What impact has RDMA had on LinkedIn's storage costs and performance?
LinkedIn's use of RoCEv2 is projected to reduce storage cost-basis by 60% while increasing NVMe flash utilization to over 75%. It allows network disks to deliver performance comparable to locally attached disks, ensuring efficient resource usage.
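
The DCQCN prerequisite mentioned above is a congestion-control scheme in which the sender lowers its rate when it receives a Congestion Notification Packet (CNP). The following is a minimal sketch of that sender-side reaction, simplified from the published DCQCN description. The class name, field names, and starting values are illustrative assumptions rather than LinkedIn's configuration, and the rate-recovery phases are omitted.

```python
from dataclasses import dataclass


@dataclass
class ReactionPoint:
    """Simplified sender-side DCQCN state for one flow (hypothetical names)."""

    current_rate_gbps: float
    target_rate_gbps: float = 0.0
    alpha: float = 1.0    # congestion estimate, kept between 0 and 1
    g: float = 1 / 256    # gain used when updating alpha (illustrative value)

    def on_cnp(self) -> None:
        """React to a CNP: remember the old rate, then cut the current rate."""
        self.target_rate_gbps = self.current_rate_gbps
        self.current_rate_gbps *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g

    def on_quiet_period(self) -> None:
        """Decay alpha when no CNP arrived during the update interval."""
        self.alpha *= 1 - self.g
```

The longer a flow goes without a CNP, the smaller alpha becomes, so the next rate cut is gentler. Repeated CNPs keep alpha near 1 and each cut stays close to a halving.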

Key Statistics & Figures

Projected reduction in storage cost-basis
60%
This reduction is expected over the next three years due to the implementation of RDMA.
Expected NVMe flash utilization
over 75%
This utilization rate is projected as a result of disaggregating NVMe flash storage.
Read load achieved during testing
2.1 gigabytes per second
This performance was observed with RDMA enabled, demonstrating the technology's effectiveness in high-throughput scenarios.

Technologies & Tools

Networking
Remote Direct Memory Access (RDMA)
Used to enable high-performance storage capabilities with low latency.
Networking
RoCEv2
A protocol that allows RDMA over commodity Ethernet fabrics.
Networking
SONiC
The switch operating system modified for LinkedIn's infrastructure requirements.

Key Actionable Insights

1
Implementing RDMA can significantly enhance storage performance and scalability in data centers.
By disaggregating NVMe storage from compute resources, organizations can scale their storage independently, leading to better resource utilization and reduced costs.
2
Configuring a lossless Ethernet network is crucial for maximizing RDMA benefits.
When RoCEv2 and other application traffic share the same links, trusting DSCP markings keeps storage traffic in its own lossless class, so congestion from other traffic does not degrade storage performance.
3
Utilizing PFC can guarantee lossless networking for RDMA traffic.
By implementing PFC, data centers can avoid packet loss during congestion, ensuring that critical storage operations remain uninterrupted.
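
The PFC behaviour described above can be sketched as a buffer with two thresholds: when occupancy reaches the upper one (XOFF), the switch asks the upstream sender to pause that priority, and when it drains to the lower one (XON), it lets the sender resume. This is a toy model for illustration only. The class name, the cell units, and the threshold values are assumptions, not values from LinkedIn's deployment or from any switch API.

```python
from typing import Optional


class LosslessIngressQueue:
    """Toy model of one PFC-enabled priority queue (hypothetical names)."""

    def __init__(self, xoff_cells: int, xon_cells: int) -> None:
        if xon_cells >= xoff_cells:
            raise ValueError("XON threshold must be below XOFF threshold")
        self.xoff_cells = xoff_cells
        self.xon_cells = xon_cells
        self.depth = 0
        self.paused = False

    def enqueue(self, cells: int) -> Optional[str]:
        """Buffer arriving data; signal PAUSE once when XOFF is reached."""
        self.depth += cells
        if not self.paused and self.depth >= self.xoff_cells:
            self.paused = True
            return "PAUSE"
        return None

    def dequeue(self, cells: int) -> Optional[str]:
        """Drain data; signal RESUME once the queue falls to XON."""
        self.depth = max(0, self.depth - cells)
        if self.paused and self.depth <= self.xon_cells:
            self.paused = False
            return "RESUME"
        return None
```

Data can still arrive after the pause is signalled, because frames already in flight must land somewhere. Real lossless buffer configuration reserves headroom above XOFF for that reason, which this model leaves unbounded for simplicity.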

Common Pitfalls

1
Failing to configure a lossless Ethernet network can lead to congestion and packet loss.
Without proper configuration, RDMA traffic may not perform optimally, resulting in degraded service quality and inefficient resource utilization.
2
Not trusting DSCP markings can cause RoCEv2 traffic to be misclassified.
If DSCP markings are not trusted, RoCEv2 traffic may not be prioritized correctly, leading to performance issues during high-load scenarios.
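
To make the DSCP pitfall concrete, here is a minimal sketch of how a port that trusts DSCP might choose a queue. The DSCP field is the upper six bits of the IPv4 ToS byte. The DSCP value 26 and the queue names are hypothetical choices for this example and do not come from the article.

```python
ROCE_DSCP = 26  # hypothetical marking for RoCEv2 storage traffic


def dscp_from_tos(tos_byte: int) -> int:
    """Extract the 6-bit DSCP value from the 8-bit IPv4 ToS byte."""
    return (tos_byte >> 2) & 0x3F


def select_queue(tos_byte: int, trust_dscp: bool) -> str:
    """Pick a queue; without DSCP trust everything lands in the lossy default."""
    if trust_dscp and dscp_from_tos(tos_byte) == ROCE_DSCP:
        return "lossless-storage"
    return "default-lossy"


# A packet marked with the hypothetical storage DSCP.
storage_tos = ROCE_DSCP << 2
print(select_queue(storage_tos, trust_dscp=True))
print(select_queue(storage_tos, trust_dscp=False))
```

With trust disabled, the storage packet shares the lossy default queue with all other traffic, which is the misclassification this pitfall describes.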

Related Concepts

Data Management
Infrastructure
Networking Technologies
Quality of Service (QoS)