Flaky tests are a well-known problem across the industry and Microsoft is no exception. In this post, Suresh Thummalapenta walks us through the team's comprehensive flaky test management system that helps to infer, triage, and quarantine those tests.
Overview
The article discusses the challenges posed by flaky tests in software development and presents a comprehensive flaky test management system developed by Microsoft to enhance developer productivity. It details the system's phases of inference, reporting, and mitigation, which collectively aim to identify, manage, and suppress flaky tests effectively.
What You'll Learn
How to identify flaky tests using telemetry data
Why managing flaky tests is crucial for developer productivity
When to quarantine tests to prevent misleading failures
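As a rough illustration of the telemetry-based identification mentioned above, the sketch below flags a test as flaky when it has both passed and failed against the same commit. That rule is an assumption chosen for illustration, not necessarily the inference logic Microsoft's system uses, and the `Execution` record and `infer_flaky_tests` function are hypothetical names.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Set


class Execution(NamedTuple):
    """One test execution record, as it might appear in telemetry (hypothetical shape)."""
    test_name: str
    commit: str
    passed: bool


def infer_flaky_tests(executions: Iterable[Execution]) -> Set[str]:
    """Return tests that both passed and failed on the same commit.

    Assumption: identical code with differing outcomes indicates flakiness.
    """
    outcomes = defaultdict(set)
    for run in executions:
        outcomes[(run.test_name, run.commit)].add(run.passed)
    # A (test, commit) pair with two distinct outcomes saw both a pass and a fail.
    return {test for (test, _commit), seen in outcomes.items() if len(seen) == 2}


executions = [
    Execution("test_upload", "c1", True),
    Execution("test_upload", "c1", False),  # same commit, different outcome
    Execution("test_login", "c1", False),
    Execution("test_login", "c2", True),    # outcome changed along with the code
]
flaky = infer_flaky_tests(executions)
```

Here `test_login` is not flagged, because its outcome changed only when the commit changed, which is consistent with a genuine fix rather than flakiness.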
Key Questions Answered
What are flaky tests and why are they problematic?
How does Microsoft's flaky test management system work?
What results has the flaky test management system achieved?
Key Actionable Insights
1. Implement a flaky test management system to enhance developer productivity. By effectively identifying and managing flaky tests, teams can reduce the time spent on false failures, allowing developers to focus on actual issues and improving overall workflow.
2. Utilize telemetry data to monitor test executions and identify flaky tests. Monitoring telemetry can provide insights into test reliability, enabling teams to proactively address flaky tests before they disrupt the development process.
3. Establish policies for handling flaky tests, such as blocking PRs for developers with excessive flaky test bugs. This can encourage accountability among developers and promote a culture of quality in testing practices.
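The quarantine and policy ideas above might be sketched as follows. This is a minimal example under stated assumptions: the threshold `MAX_OPEN_FLAKY_BUGS` and both function names are hypothetical, and the source does not specify the actual limits or mechanics of Microsoft's system.

```python
from typing import Iterable, List, Set

# Hypothetical threshold: the source does not state the real limit.
MAX_OPEN_FLAKY_BUGS = 3


def blocking_failures(failed_tests: Iterable[str], quarantined: Set[str]) -> List[str]:
    """Return the failures that should still fail the build.

    Failures of quarantined (known-flaky) tests are suppressed so they
    do not mislead developers.
    """
    return sorted(set(failed_tests) - quarantined)


def is_pr_blocked(open_flaky_bugs: int, limit: int = MAX_OPEN_FLAKY_BUGS) -> bool:
    """Block a PR when its author has more open flaky-test bugs than the limit."""
    return open_flaky_bugs > limit


remaining = blocking_failures(["test_upload", "test_login"], {"test_upload"})
```

Keeping suppression and the blocking policy as separate functions lets a team adjust the threshold without touching how quarantined failures are filtered.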