Overview
The article discusses the implementation of dark canary clusters at LinkedIn, a technique for safely testing new code in production environments without impacting end users. It highlights the benefits of this approach in improving developer productivity and system stability while managing the complexities of modern CI/CD pipelines.
What You'll Learn
1. How to implement dark canary clusters for safe production testing
2. Why dark canaries are essential for validating new code in production environments
3. When to use dark canaries versus traditional testing methods
Prerequisites & Requirements
- Understanding of CI/CD pipelines and production testing
- Familiarity with Apache ZooKeeper and Rest.li (optional)
Key Questions Answered
What is a dark canary and how does it work?
A dark canary is a service instance that runs new code against duplicated production traffic but discards its responses, so end users are never affected by problems in that code. This lets developers validate changes without impacting the customer experience.
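The discard-the-response idea can be sketched in a few lines. This is a minimal illustration only, not LinkedIn's implementation: the function name `serve_with_dark_canary` and the two handler callables are hypothetical, and a real system would typically duplicate the request asynchronously rather than inline.

```python
from typing import Callable

# Hypothetical handler type: takes a request dict, returns a response dict.
Handler = Callable[[dict], dict]


def serve_with_dark_canary(request: dict, production: Handler, dark_canary: Handler) -> dict:
    """Answer from production; replay a copy on the dark canary and drop its result."""
    response = production(request)
    try:
        # The dark canary gets its own copy of the request.
        # Its return value is intentionally ignored.
        dark_canary(dict(request))
    except Exception:
        # A failure in the new code must never reach the caller.
        pass
    return response
```

The caller only ever sees the production response, which is what makes the canary "dark": a bug in the new code shows up in the canary's own logs and metrics, not in user-facing behavior.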
How did LinkedIn implement dark canary clusters?
LinkedIn introduced dark canary clusters by using the Dynamic Discovery (D2) service to send duplicated traffic to multiple dark canary instances, allowing for effective testing of new code while minimizing operational complexity.
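To make the cluster idea concrete, here is a rough sketch of choosing one of several dark canary instances for each duplicated request. It does not use or represent the D2 API: the class `DarkClusterRouter`, its `dark_hosts` mapping, and the round-robin policy are all assumptions made for illustration.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class DarkClusterRouter:
    """Hypothetical router from a service name to its dark canary hosts."""

    dark_hosts: Dict[str, List[str]]
    _cursors: Dict[str, Iterator[str]] = field(default_factory=dict)

    def pick_dark_host(self, service: str) -> Optional[str]:
        """Return the next dark host for the service, or None if it has none."""
        hosts = self.dark_hosts.get(service)
        if not hosts:
            return None
        # Keep one round-robin cursor per service so load spreads evenly.
        cursor = self._cursors.setdefault(service, itertools.cycle(hosts))
        return next(cursor)
```

Keying the lookup on the service rather than on individual source hosts is the point of the sketch: the sender no longer needs a hand-maintained host-to-host mapping.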
What are the benefits of using dark canary clusters?
Dark canary clusters allow for substantial traffic testing without impacting user experience, improve developer productivity, and help maintain system stability by validating changes in a controlled manner.
What challenges did LinkedIn face with dark canaries?
Challenges included operational complexity in maintaining dark canaries, such as managing host-to-host mappings and ensuring traffic duplication without impacting performance or incurring extra costs.
Technologies & Tools
Backend
Apache ZooKeeper
Used for managing host-to-host mappings and service discovery in dark canary implementations.
Backend
Rest.li
Framework used to support dark canary functionality in LinkedIn's services.
Key Actionable Insights
1. Implement dark canary clusters to enhance your production testing strategy. This approach allows for safe validation of new code without affecting end users, making it particularly useful in fast-paced development environments.
2. Utilize service discovery mechanisms like D2 to manage dark canary traffic effectively. By integrating with existing service discovery tools, you can streamline the process of sending duplicated requests to dark canaries, reducing operational overhead.
3. Consider the implications of downstream service calls when using dark canaries. Ensure that any requests made by dark canaries do not have side effects that could impact system performance or incur unnecessary costs.
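One possible way to contain downstream side effects is to tag duplicated requests and have side-effecting components check the tag. The sketch below assumes such a tag exists; `RequestContext`, its `is_dark` flag, and `NotificationSender` are hypothetical names, not part of Rest.li or D2.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RequestContext:
    # Assumed flag: set to True when the request is a duplicated (dark) copy.
    is_dark: bool = False


@dataclass
class NotificationSender:
    """Example side-effecting component that refuses to act on dark traffic."""

    sent: List[str] = field(default_factory=list)
    suppressed: int = 0

    def send(self, ctx: RequestContext, message: str) -> bool:
        if ctx.is_dark:
            # Count the suppression so the code path is still observable.
            self.suppressed += 1
            return False
        self.sent.append(message)
        return True
```

Counting suppressed calls, rather than silently dropping them, keeps the dark code path measurable while ensuring nothing is actually sent, billed, or recorded in business metrics.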
Common Pitfalls
1. Failing to manage side effects of requests made by dark canaries can lead to unexpected costs and performance issues.
Make sure dark canary requests do not inadvertently affect downstream services or business metrics; such side effects complicate testing and can distort production data.