The third in a series on building services architecture, this article looks at how we built resilience engineering practices into the…
Overview
In this article, the authors describe the resilience engineering practices built into Airbnb's service platform, which supports the company's service-oriented architecture. They treat resilience as a requirement rather than a feature, and share the strategies they implemented to improve service availability and performance.
What You'll Learn
How to implement asynchronous request processing in Java services
Why request queuing is essential for handling burst traffic
How to apply load shedding techniques to prevent service overload
When to use dependency isolation to enhance service resilience
Prerequisites & Requirements
- Understanding of service-oriented architecture and resilience engineering concepts
- Familiarity with Java and the Dropwizard framework (optional)
Key Questions Answered
What resilience engineering practices are implemented at Airbnb?
How does Airbnb handle service overload and prevent cascading failures?
What is the impact of resilience on service availability at Airbnb?
What role does request queuing play in service performance?
Key Actionable Insights
1. Implement asynchronous request processing to enhance throughput and resource utilization in your services. Asynchronous processing allows services to handle more concurrent requests without blocking I/O threads, which is particularly beneficial during traffic spikes.
2. Utilize request queuing techniques to manage burst traffic effectively. By applying a controlled delay queue, services can prevent overload and maintain performance during high-demand periods.
3. Adopt load shedding strategies to protect services from excessive load. Implementing service back pressure and client quota-based rate limiting can help maintain service stability and prevent cascading failures during traffic surges.
4. Use dependency isolation to mitigate the impact of problematic downstream services. By isolating dependencies, services can continue to function even if one or more downstream services experience issues, thereby enhancing overall resilience.
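The request-queuing insight can be illustrated with a deliberately simplified sketch. It is not Airbnb's implementation and not the full controlled-delay (CoDel) algorithm: it only shows the core idea of bounding how long a request may wait in the queue and discarding requests that have waited too long, since the caller has probably timed out already. The class name `DelayBoundedQueue`, its parameters, and the injected millisecond clock are all hypothetical names chosen for this example.

```java
import java.util.ArrayDeque;
import java.util.Objects;
import java.util.Optional;
import java.util.function.LongSupplier;

/**
 * Hypothetical, simplified delay-bounded request queue.
 * Assumption: requests older than maxQueueDelayMillis are not worth serving.
 */
final class DelayBoundedQueue<T> {
    /** A queued request together with the time it entered the queue. */
    private static final class Entry<T> {
        final T request;
        final long enqueuedAtMillis;

        Entry(T request, long enqueuedAtMillis) {
            this.request = request;
            this.enqueuedAtMillis = enqueuedAtMillis;
        }
    }

    private final ArrayDeque<Entry<T>> queue = new ArrayDeque<>();
    private final int capacity;
    private final long maxQueueDelayMillis;
    private final LongSupplier clockMillis; // injected so the behavior is testable
    private long dropped;

    DelayBoundedQueue(int capacity, long maxQueueDelayMillis, LongSupplier clockMillis) {
        this.capacity = capacity;
        this.maxQueueDelayMillis = maxQueueDelayMillis;
        this.clockMillis = Objects.requireNonNull(clockMillis);
    }

    /** Enqueues a request; returns false (rejected) when the queue is full. */
    synchronized boolean offer(T request) {
        Objects.requireNonNull(request);
        if (queue.size() >= capacity) {
            dropped++;
            return false;
        }
        queue.addLast(new Entry<>(request, clockMillis.getAsLong()));
        return true;
    }

    /** Returns the oldest request that is still fresh, discarding stale ones. */
    synchronized Optional<T> poll() {
        long now = clockMillis.getAsLong();
        Entry<T> entry;
        while ((entry = queue.pollFirst()) != null) {
            if (now - entry.enqueuedAtMillis <= maxQueueDelayMillis) {
                return Optional.of(entry.request);
            }
            dropped++; // waited too long in the queue: shed instead of serving
        }
        return Optional.empty();
    }

    /** Number of requests rejected at enqueue time or discarded as stale. */
    synchronized long droppedCount() {
        return dropped;
    }
}
```

A worker thread would call `poll()` in a loop and process whatever it returns, while the request handler calls `offer()` and answers with an overload error when it returns `false`. In production code the clock would typically be something like `System::currentTimeMillis`.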
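The dependency isolation insight can be sketched as a per-dependency concurrency limit, often called a bulkhead. This is a minimal illustration under stated assumptions, not the mechanism the article's authors built: the class `DependencyBulkhead`, the dependency names, and the fixed per-dependency limit are hypothetical. Each downstream dependency gets its own pool of permits, so a slow dependency can exhaust only its own permits, and calls to it fail fast while calls to other dependencies proceed.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical bulkhead: caps in-flight async calls separately per dependency. */
final class DependencyBulkhead {
    private final int maxConcurrentPerDependency;
    private final Map<String, Semaphore> permits = new ConcurrentHashMap<>();

    DependencyBulkhead(int maxConcurrentPerDependency) {
        this.maxConcurrentPerDependency = maxConcurrentPerDependency;
    }

    /**
     * Runs asyncCall if the named dependency has a free permit; otherwise
     * returns an already-failed future without invoking the dependency.
     */
    <T> CompletableFuture<T> call(String dependency, Supplier<CompletableFuture<T>> asyncCall) {
        Semaphore semaphore = permits.computeIfAbsent(
                dependency, name -> new Semaphore(maxConcurrentPerDependency));

        if (!semaphore.tryAcquire()) {
            // Fail fast: this dependency is saturated, others are unaffected.
            CompletableFuture<T> rejected = new CompletableFuture<>();
            rejected.completeExceptionally(
                    new IllegalStateException("Bulkhead full for dependency: " + dependency));
            return rejected;
        }

        try {
            // Release the permit when the call finishes, successfully or not.
            return asyncCall.get().whenComplete((result, error) -> semaphore.release());
        } catch (RuntimeException e) {
            semaphore.release(); // the call never started, so free the permit now
            throw e;
        }
    }
}
```

Because the call is expressed as a `CompletableFuture`, no request thread blocks while waiting on the dependency, which fits the asynchronous processing described in the first insight. Choosing the limit per dependency (rather than one shared value, as here) would be a natural refinement.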