OCI Accelerates HPC, AI, and Database Using RoCE and NVIDIA ConnectX

Oracle Cloud Infrastructure uses an innovative approach to deliver scalable, RDMA-powered networking on Ethernet for a multitude of distributed workloads…

John F. Kim
17 min read | advanced

Overview

The article discusses how Oracle Cloud Infrastructure (OCI) leverages RDMA over Converged Ethernet (RoCE) and NVIDIA ConnectX technology to enhance high-performance computing (HPC), AI, and database workloads. It highlights the importance of low-latency networking and optimized congestion control for achieving high throughput and performance in distributed computing environments.

What You'll Learn

1. How to implement RDMA for high-performance applications in OCI
2. Why RoCE is preferred over InfiniBand for certain workloads
3. When to use explicit congestion notification for network management
4. How to optimize network performance for distributed workloads

Prerequisites & Requirements

  • Understanding of RDMA and network protocols
  • Experience with cloud infrastructure and distributed systems (optional)

Key Questions Answered

What is RDMA and how does it improve network performance?
Remote Direct Memory Access (RDMA) lets one machine read or write another machine's memory directly through the network adapter, keeping the host CPU and operating system out of the data path. This reduces latency and increases throughput. The technology is essential for applications like AI and HPC, where performance is critical, because it minimizes the overhead associated with traditional networking methods.
How does OCI implement RoCE for scalable networking?
OCI uses a customized congestion control solution based on the data center quantized congestion notification (DC-QCN) algorithm to manage traffic effectively across thousands of nodes. This implementation ensures low latency and high throughput for workloads requiring RDMA, such as AI and HPC applications.
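To make the sender side of this concrete, here is a minimal sketch of the rate update from the published DCQCN algorithm. It is not OCI's customized implementation, which the article does not detail. The class name `DcqcnSender`, the 100 Gb/s line rate, and the gain `g = 1/256` are illustrative assumptions, and the additive and hyper increase stages are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class DcqcnSender:
    """Simplified DCQCN sender state (hypothetical class; rates in Gb/s)."""

    line_rate: float = 100.0     # assumed link speed, for illustration only
    current_rate: float = 100.0  # rate the sender is transmitting at now
    target_rate: float = 100.0   # rate to recover toward after a cut
    alpha: float = 1.0           # congestion estimate in [0, 1]
    g: float = 1 / 256           # gain for updating alpha (illustrative value)

    def on_cnp(self) -> None:
        """A congestion notification packet arrived: remember the rate, then cut it."""
        self.target_rate = self.current_rate
        self.current_rate *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g

    def on_alpha_timer(self) -> None:
        """No notification arrived during the timer window: decay the estimate."""
        self.alpha = (1 - self.g) * self.alpha

    def on_recovery_step(self) -> None:
        """Fast recovery: move halfway back toward the target rate."""
        midpoint = (self.target_rate + self.current_rate) / 2
        self.current_rate = min(self.line_rate, midpoint)


sender = DcqcnSender()
sender.on_cnp()
print(sender.current_rate)  # 50.0
sender.on_recovery_step()
print(sender.current_rate)  # 75.0
```

The first notification halves the rate because the congestion estimate starts at 1. As the estimate decays during quiet periods, later cuts become gentler, which is how the scheme trades a fast reaction against sustained throughput.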
What are the limitations of priority flow control (PFC) in RoCE networks?
Priority Flow Control (PFC) operates only at Layer 2, so it cannot manage congestion across different subnets. Because it pauses traffic rather than slowing individual senders, it can also cause congestion to spread to unrelated traffic flows, making it less effective for large-scale distributed applications.
What advantages does InfiniBand offer compared to Ethernet for HPC?
InfiniBand provides lossless networking optimized for high bandwidth and low latency, making it ideal for HPC and AI workloads. It offloads networking tasks from the CPU, allowing for efficient data movement and enabling faster problem-solving in distributed computing environments.

Key Statistics & Figures

OCI revenue
$4 billion per quarter
This reflects OCI's growth and its ability to support over 22,000 customers.
Number of OCI regions
41 regions
OCI has expanded by adding 11 regions in the last 18 months to support various deployment models.
Latency for small clusters
as low as 2 microseconds
This demonstrates the effectiveness of OCI's optimized network for distributed workloads.
Latency for large clusters
typically under 4 microseconds
This indicates the high performance achievable with OCI's dedicated RoCE network.

Technologies & Tools


Cloud Platform
Oracle Cloud Infrastructure
Provides a scalable environment for running HPC, AI, and database workloads.
Networking
RoCE
Enables efficient data transfer over Ethernet networks.
Network Adapter
NVIDIA ConnectX
Facilitates high-performance networking with RDMA capabilities.

Key Actionable Insights

1
Implementing RDMA can significantly enhance the performance of distributed applications by reducing CPU overhead and improving data transfer speeds.
This is particularly beneficial for workloads that require high throughput and low latency, such as AI and HPC applications, where every microsecond counts.
2
Utilizing a dedicated RoCE network can help manage different types of application traffic more effectively, ensuring optimal performance.
By isolating RDMA traffic from standard data center traffic, OCI can tailor congestion control mechanisms to meet the specific needs of various workloads.
3
Optimizing congestion control profiles based on workload requirements can lead to better resource utilization and performance.
By customizing settings for different types of applications, such as latency-sensitive or throughput-sensitive workloads, organizations can achieve a balance that maximizes efficiency.
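One way to picture a congestion control profile is as a set of ECN marking thresholds on a switch queue. The sketch below uses the common linear marking curve between a minimum and a maximum queue depth. The profile names, the `EcnProfile` type, and every threshold value are hypothetical and are not taken from OCI's configuration. It assumes a latency-sensitive profile marks packets at shallower queue depths than a throughput-sensitive one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EcnProfile:
    """Hypothetical ECN marking profile; thresholds are queue depths in KB."""

    name: str
    kmin_kb: float  # below this depth, no packets are marked
    kmax_kb: float  # at or above this depth, every packet is marked
    pmax: float     # marking probability just below kmax_kb


def marking_probability(profile: EcnProfile, queue_kb: float) -> float:
    """Return the probability that a packet is ECN-marked at this queue depth."""
    if queue_kb <= profile.kmin_kb:
        return 0.0
    if queue_kb >= profile.kmax_kb:
        return 1.0
    span = profile.kmax_kb - profile.kmin_kb
    return profile.pmax * (queue_kb - profile.kmin_kb) / span


# Illustrative values only: real thresholds depend on link speed and buffer size.
LATENCY_SENSITIVE = EcnProfile("latency-sensitive", kmin_kb=50, kmax_kb=200, pmax=0.2)
THROUGHPUT_SENSITIVE = EcnProfile("throughput-sensitive", kmin_kb=400, kmax_kb=1600, pmax=0.2)

for profile in (LATENCY_SENSITIVE, THROUGHPUT_SENSITIVE):
    print(profile.name, marking_probability(profile, queue_kb=125))
```

With these made-up numbers, a 125 KB queue already triggers marking under the latency-sensitive profile while the throughput-sensitive profile leaves packets unmarked. Senders in the first case slow down sooner and keep queues short, while senders in the second keep the link busier at the cost of deeper queues.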

Common Pitfalls

1
Relying solely on PFC for congestion management can lead to inefficiencies in large networks.
PFC's limitations in handling congestion across subnets can result in traffic bottlenecks. It's essential to implement more robust congestion control mechanisms like ECN for better scalability.

Related Concepts

RDMA Technology and Its Applications
Congestion Control Mechanisms in Networking
High-Performance Computing Architectures