Our Experience Porting the YJIT Ruby Compiler to Rust

In this post, I want to give a nuanced perspective on our experience porting YJIT from C to Rust. I'll talk about the positives, but also discuss the things we found challenging or suboptimal.

Maxime Chevalier-Boisvert

Overview

This article details the experience of porting the YJIT Ruby compiler from C to Rust at Shopify, highlighting the challenges and advantages encountered during the transition. It discusses the complexities of compiler design, the benefits of Rust's features, and the lessons learned throughout the porting process.

What You'll Learn

1. How to manage complexity in compiler design using Rust
2. Why Rust's type safety is beneficial for systems programming
3. How to utilize Rust's pattern matching and macros effectively
4. When to use unsafe blocks in Rust for interfacing with C code

Prerequisites & Requirements

  • Understanding of compiler design principles
  • Familiarity with C and Rust programming languages

Key Questions Answered

What challenges did the YJIT team face while porting to Rust?
The YJIT team faced challenges such as managing complexity, integrating with the existing CRuby codebase, and navigating Rust's strict type system. They also encountered difficulties with tools like Bindgen and Cargo, particularly in handling C interop and ensuring offline builds.
How does Rust improve upon C in compiler development?
Rust offers strong type safety, better memory management, and features like pattern matching and macros, which help reduce bugs and improve code maintainability. These features make it easier to manage the complexity inherent in compiler design compared to C.
What is the significance of using unsafe blocks in Rust?
Unsafe blocks in Rust are required for operations the compiler cannot verify, such as dereferencing raw pointers or calling C functions. Inside them, responsibility for upholding memory safety shifts from the compiler to the developer. This is crucial for low-level systems programming, but it introduces risk if the blocks are not used carefully.
What are the benefits of Rust's macro system compared to C's preprocessor?
Rust's macro system is safer and more ergonomic than C's preprocessor macros. It allows for intuitive code generation while leveraging Rust's pattern matching capabilities, making it easier to write and maintain complex code.
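
As a rough illustration of the macro point above, the sketch below uses `macro_rules!` to generate an enum and a lookup method from a single list of variants. The macro name `define_opcodes!`, the `Opcode` enum, and its variants are hypothetical names chosen for this example and are not taken from YJIT's source.

```rust
// Hypothetical example: none of these names come from YJIT.
// One list of variants generates both an enum and a `name()` method.
// In C this would typically need an X-macro trick with the preprocessor.
macro_rules! define_opcodes {
    ($enum_name:ident { $($variant:ident => $text:expr),+ $(,)? }) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub enum $enum_name {
            $($variant),+
        }

        impl $enum_name {
            /// Human-readable mnemonic for this opcode.
            pub fn name(self) -> &'static str {
                // The macro expands to an exhaustive match, so adding a
                // variant to the list automatically extends this method.
                match self {
                    $($enum_name::$variant => $text),+
                }
            }
        }
    };
}

define_opcodes!(Opcode {
    Add => "add",
    Sub => "sub",
    Jmp => "jmp",
});
```

With this in place, `Opcode::Add.name()` returns `"add"`. Because the macro matches on syntax fragments (`ident`, `expr`) rather than raw text, a malformed variant list is rejected at compile time instead of producing surprising expansions.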

Key Statistics & Figures

Lines of C code in YJIT: 11,000
YJIT is described as a relatively simple JIT compiler totaling about 11,000 lines of C code.
Age of CRuby codebase: 30 years
The CRuby codebase is noted to be close to 30 years old, which presents challenges in integration.
Duration of the porting effort: 3 months
The YJIT team completed the port from C to Rust in three months.

Technologies & Tools

Programming Language: Rust
Used to implement the YJIT compiler for better complexity management.
Programming Language: C
Original language of the YJIT compiler before porting to Rust.
Build System: Cargo
Rust's package manager and build system, used for managing the YJIT project.
Tool: Bindgen
Used to automatically generate Rust bindings from C definitions.

Key Actionable Insights

1. Utilize Rust's pattern matching and macro capabilities to simplify complex code structures.
   By leveraging these features, developers can write cleaner and more maintainable code, reducing the likelihood of bugs and improving overall code quality.
2. Consider the trade-offs of using unsafe blocks when interfacing with C code in Rust.
   While unsafe blocks are necessary for certain operations, excessive use can lead to code that is difficult to maintain and understand. Aim to minimize their usage by encapsulating unsafe code in safe abstractions.
3. Take advantage of Rust's strong type system to manage complexity in compiler development.
   Using Rust's type safety features can help prevent common programming errors, making it easier to maintain and extend compiler projects over time.
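
To make the first and third insights more concrete, here is a minimal sketch of how an enum and an exhaustive `match` can encode the rules of a small assembler. The `Operand` type, the `classify_mov` function, and the encoding rules are simplified assumptions made up for this example; they are not YJIT's actual definitions.

```rust
/// Hypothetical operand type for a toy assembler; not YJIT's real definitions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Operand {
    /// A machine register, identified by number.
    Reg(u8),
    /// An immediate constant.
    Imm(i64),
    /// A memory access at `base + disp`.
    Mem { base: u8, disp: i32 },
}

/// Describe how `mov dst, src` would be handled, or reject invalid combinations.
pub fn classify_mov(dst: Operand, src: Operand) -> Result<&'static str, String> {
    match (dst, src) {
        (Operand::Reg(_), Operand::Reg(_)) => Ok("reg <- reg"),
        (Operand::Reg(_), Operand::Imm(_)) => Ok("reg <- imm"),
        (Operand::Reg(_), Operand::Mem { .. }) => Ok("load"),
        (Operand::Mem { .. }, Operand::Reg(_)) => Ok("store"),
        // In this toy model, only 32-bit immediates can be stored to memory.
        (Operand::Mem { .. }, Operand::Imm(v))
            if i64::from(i32::MIN) <= v && v <= i64::from(i32::MAX) =>
        {
            Ok("store imm32")
        }
        (Operand::Mem { .. }, Operand::Imm(v)) => {
            Err(format!("immediate {v} does not fit in 32 bits"))
        }
        (Operand::Mem { .. }, Operand::Mem { .. }) => {
            Err("memory-to-memory mov is not encodable".to_string())
        }
        (Operand::Imm(_), _) => Err("cannot write to an immediate".to_string()),
    }
}
```

The compiler checks that the `match` covers every combination of operands, so adding a new `Operand` variant produces a compile error at each place that has not yet been updated. An equivalent C version built on a tagged union and `switch` statements would typically compile without complaint and fail at run time.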

Common Pitfalls

1. Overusing unsafe blocks can lead to code that is difficult to maintain and understand.
   While unsafe blocks are necessary for certain low-level operations, relying on them too heavily can introduce risks and make the codebase more error-prone.
2. Failing to leverage Rust idioms during the port can result in less efficient and harder-to-read code.
   The article notes that the team refrained from making major architectural changes during the port, which limited their ability to fully utilize Rust's strengths.
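
One common way to limit the first pitfall is to confine `unsafe` to a small type with a safe public interface. The sketch below shows the idea with a fixed-size byte buffer accessed through a raw pointer. `CodeBuffer` and its methods are hypothetical and much simpler than anything in YJIT. Here the pointer comes from a leaked `Box`; in a real port it would more likely be handed over by C code, which is an assumption of this example.

```rust
/// Hypothetical fixed-size code buffer; a simplified stand-in, not YJIT's real type.
pub struct CodeBuffer {
    ptr: *mut u8,     // raw pointer to the start of the buffer
    capacity: usize,  // total size of the allocation in bytes
    write_pos: usize, // number of bytes written so far
}

impl CodeBuffer {
    /// Allocate a zeroed buffer and keep only a raw pointer to it.
    pub fn new(capacity: usize) -> Self {
        let boxed: Box<[u8]> = vec![0u8; capacity].into_boxed_slice();
        let ptr = Box::into_raw(boxed) as *mut u8;
        CodeBuffer { ptr, capacity, write_pos: 0 }
    }

    /// Append one byte, returning an error instead of writing out of bounds.
    pub fn write_byte(&mut self, byte: u8) -> Result<(), &'static str> {
        if self.write_pos >= self.capacity {
            return Err("code buffer is full");
        }
        // SAFETY: write_pos < capacity, so the write stays inside the allocation.
        unsafe { self.ptr.add(self.write_pos).write(byte) };
        self.write_pos += 1;
        Ok(())
    }

    /// View the bytes written so far as an ordinary slice.
    pub fn written(&self) -> &[u8] {
        // SAFETY: the first write_pos bytes are initialized and owned by self.
        unsafe { std::slice::from_raw_parts(self.ptr, self.write_pos) }
    }
}

impl Drop for CodeBuffer {
    fn drop(&mut self) {
        // SAFETY: ptr and capacity describe the boxed slice leaked in `new`.
        unsafe {
            let slice = std::ptr::slice_from_raw_parts_mut(self.ptr, self.capacity);
            drop(Box::from_raw(slice));
        }
    }
}
```

Callers only see `new`, `write_byte`, and `written`, none of which require `unsafe`. The three `unsafe` blocks each carry a comment stating the invariant they rely on, which keeps the code that needs careful review small and easy to find.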

Related Concepts

Compiler Design Principles
Systems Programming Languages
Interfacing Rust With C
Memory Management In Rust