Accelerating Services at Airbnb by Building — and Open Sourcing — a Blazing Fast Thrift Binding for Ruby

Introducing Sparsam, Airbnb’s implementation of Thrift Serializer in Ruby. Sparsam is up to 25x faster than the old Thrift Ruby Binding in…

Andy Fang
7 min read · intermediate

Overview

The article discusses Airbnb's development and open-sourcing of Sparsam, a fast Thrift binding for Ruby that significantly improves serialization and deserialization speeds. Sparsam achieves up to 25x faster serialization and 8x faster deserialization compared to the previous Ruby Thrift binding, enhancing performance for Airbnb's services.

What You'll Learn

1. How to optimize serialization performance in Ruby applications
2. Why caching schema information can enhance serialization speed
3. How to implement effective data structures for performance in Ruby

Prerequisites & Requirements

  • Understanding of serialization protocols like Thrift
  • Familiarity with Ruby and performance optimization techniques

Key Questions Answered

What are the performance improvements achieved by Sparsam?
Sparsam achieves up to 25x faster serialization and 8x faster deserialization compared to the old Ruby Thrift Binding. This significant performance enhancement allows Airbnb to transition more endpoints from legacy JSON to Thrift without compromising speed.
How does Sparsam avoid costly string allocations?
Sparsam minimizes costly string allocations by caching interned IDs of strings and storing fields in a hash map. This approach reduces the overhead associated with creating and interning Ruby strings during deserialization, resulting in a 25% speedup.
What validation modes does Sparsam provide for Thrift structs?
Sparsam offers two additional validation modes beyond the default required field checks: 'strict', which checks field types, and 'recursive', which checks types in nested structs. This enhances data integrity during serialization.
Why is eliminating cross-language function calls important for performance?
Eliminating excessive cross-language function calls is crucial because they introduce overhead, especially when calling Ruby functions from C. Sparsam minimizes these calls, enhancing serialization performance by avoiding Ruby VM dispatch.
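The three validation levels described above can be pictured with a small pure-Ruby sketch. Everything here is an assumption made for illustration: the `Field` struct, the `SCHEMA` constants, the `validate` helper, and the `:normal`/`:strict`/`:recursive` symbols are hypothetical names, not Sparsam's actual API.

```ruby
# Hypothetical sketch of three validation levels; this is NOT Sparsam's API.
Field = Struct.new(:name, :type, :required)

class Address
  SCHEMA = [Field.new(:city, String, true)].freeze
  attr_accessor :city
end

class User
  SCHEMA = [
    Field.new(:id, Integer, true),
    Field.new(:address, Address, false)
  ].freeze
  attr_accessor :id, :address
end

# mode is :normal (required fields only), :strict (also checks field types),
# or :recursive (also validates the fields of nested structs).
def validate(obj, mode = :normal)
  obj.class::SCHEMA.each do |field|
    value = obj.public_send(field.name)
    if value.nil?
      raise ArgumentError, "missing required field #{field.name}" if field.required
      next
    end
    next if mode == :normal
    unless value.is_a?(field.type)
      raise TypeError, "#{field.name} should be a #{field.type}"
    end
    # Only descend into types that carry their own schema (nested structs).
    nested = field.type.const_defined?(:SCHEMA, false)
    validate(value, mode) if mode == :recursive && nested
  end
  true
end
```

In this sketch, a `User` whose `address.city` holds an integer passes `:normal` and `:strict` but fails `:recursive`, because only the recursive mode looks inside the nested struct.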

Key Statistics & Figures

Serialization speed improvement: up to 25x
Sparsam's serialization is up to 25 times faster than the old Ruby Thrift Binding.
Deserialization speed improvement: up to 8x
Sparsam's deserialization is up to 8 times faster than the previous implementation.

Technologies & Tools

Apache Thrift (Backend)
Used as the serialization protocol for communication between services at Airbnb.
C++ (Backend)
Utilized for performance-critical components of Sparsam, particularly in caching schema information.

Key Actionable Insights

1. Implement caching strategies for schema information to improve serialization speed.
By caching schema information in C++ containers, you can significantly reduce the overhead of accessing struct definitions, leading to faster serialization processes.
2. Utilize instance variables instead of hash maps for storing deserialized data.
This approach reduces the need for hash lookups and reallocations, resulting in improved performance when accessing fields in Ruby applications.
3. Focus on minimizing cross-language calls in performance-critical applications.
Handling as much logic as possible within C functions can greatly enhance performance, especially in dynamic languages like Ruby where VM calls are costly.
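The first two insights can be approximated in plain Ruby, even though Sparsam itself does this work in C++. The sketch below is a rough analogue under stated assumptions: `Listing`, `FIELDS`, `SCHEMA_CACHE`, and `decode` are hypothetical names, and the input is a simplified list of `[field_id, value]` pairs rather than real Thrift bytes.

```ruby
# Hypothetical sketch; class and method names are illustrative, not Sparsam's.
class Listing
  # Field id => field name, roughly what generated Thrift code might expose.
  FIELDS = { 1 => :id, 2 => :title }.freeze
  attr_reader(*FIELDS.values)
end

# Schema cache: built once per struct class, then reused for every object.
# It stores already-interned :@name symbols, so no "@name" String has to be
# allocated and interned while decoding each field.
SCHEMA_CACHE = Hash.new do |cache, klass|
  cache[klass] = klass::FIELDS.transform_values { |name| :"@#{name}" }.freeze
end

# Decode [field_id, value] pairs straight into instance variables,
# rather than collecting them in a per-object hash.
def decode(klass, pairs)
  ivars = SCHEMA_CACHE[klass]
  obj = klass.new
  pairs.each do |field_id, value|
    ivar = ivars[field_id]
    obj.instance_variable_set(ivar, value) if ivar # unknown ids are skipped
  end
  obj
end

listing = decode(Listing, [[1, 7], [2, "Loft"]])
puts listing.title
```

The per-class lookup table is computed on first use and shared afterwards, which is the same trade the insights describe: pay for schema inspection once, then keep the per-field work to a cheap lookup and an instance-variable write.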

Common Pitfalls

1. Relying too heavily on dynamic dispatch in Ruby can lead to performance bottlenecks.
Dynamic dispatch adds overhead to every call, and the cost adds up when handling large or deeply nested structures. Avoiding reliance on the Ruby VM for method resolution in hot paths can mitigate this issue.

Related Concepts

Performance Optimization Techniques in Ruby
Serialization Protocols and Their Implementations
Best Practices for Using C Extensions in Ruby