Overview
The article discusses the implementation of a Policy Simulator at Uber to enhance the safety and determinism of Identity and Access Management (IAM) policy changes. It highlights the challenges faced with IAM policies, the architecture of the simulator, and its integration into Uber's Unified Security Console.
What You'll Learn
1. How to use a Policy Simulator to assess IAM policy changes
2. Why incremental IAM policy deployment may not suffice for safety
3. When to implement a centralized IAM policy management system
Prerequisites & Requirements
- Understanding of IAM policies and their impact on system operations
- Familiarity with Uber's Unified Security Console (optional)
Key Questions Answered
What are the risks associated with IAM policy changes at Uber?
IAM policy changes can cause significant outages if not managed correctly, as evidenced by an incident where a policy change prevented Uber Eats customers from modifying orders. The incident shows that IAM policy changes need the kind of robust testing and safeguards that code changes receive but policy changes often lack.
How does Uber's Policy Simulator improve IAM policy management?
Uber's Policy Simulator allows policy authors to preview the impact of proposed changes in real time, enabling them to understand the exact effects before deployment. This proactive approach helps mitigate risks of unintended outages and overly permissive authorizations.
What are the stages involved in the policy simulation process?
The policy simulation process consists of two stages: fetching access logs from Policy Enforcement Points (PEPs) and replaying these logs on two authorization engines to compare current and proposed policy impacts. This helps identify potential access control changes before they are applied.
How does Uber's Policy Simulator compare to AWS and GCP IAM policy simulators?
While AWS focuses on validating existing policies and GCP emphasizes previewing impacts of proposed changes, Uber's Policy Simulator integrates directly into the Unified Security Console, allowing for streamlined testing of policy changes within its production environment.
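The two-stage process described above (fetch access logs, then replay them against the current and proposed policies and compare the decisions) can be sketched roughly as follows. This is a minimal illustration, not Uber's implementation: `AccessLogEntry`, `authorize`, `simulate`, the grant-set policy model, and the `eats-app` service name are all hypothetical names chosen for the example, and stage one is stood in for by an in-memory list.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical policy model (assumption): a policy is a set of
# (actor, action, resource) grants. Real policy engines are far richer.
Grant = Tuple[str, str, str]
Policy = FrozenSet[Grant]


@dataclass(frozen=True)
class AccessLogEntry:
    """One historical access request, as it might be recorded by a PEP."""
    actor: str
    action: str
    resource: str


@dataclass
class SimulationReport:
    newly_denied: List[AccessLogEntry]   # allowed today, denied after the change
    newly_allowed: List[AccessLogEntry]  # denied today, allowed after the change


def authorize(policy: Policy, entry: AccessLogEntry) -> bool:
    """Toy authorization engine: allow only if an exact grant exists."""
    return (entry.actor, entry.action, entry.resource) in policy


def simulate(logs: Iterable[AccessLogEntry],
             current: Policy,
             proposed: Policy) -> SimulationReport:
    """Replay each logged request against both policies and record differences."""
    report = SimulationReport(newly_denied=[], newly_allowed=[])
    for entry in logs:
        before = authorize(current, entry)
        after = authorize(proposed, entry)
        if before and not after:
            report.newly_denied.append(entry)
        elif after and not before:
            report.newly_allowed.append(entry)
    return report


if __name__ == "__main__":
    # Stage 1 (stand-in): logs that would normally be fetched from storage.
    logs = [
        AccessLogEntry("eats-app", "update", "orders"),
        AccessLogEntry("eats-app", "read", "orders"),
    ]
    current = frozenset({("eats-app", "update", "orders"),
                         ("eats-app", "read", "orders")})
    # Proposed change removes the "update" grant.
    proposed = frozenset({("eats-app", "read", "orders")})

    # Stage 2: replay and compare.
    result = simulate(logs, current, proposed)
    for entry in result.newly_denied:
        print("would be denied:", entry)
```

In this sketch, a non-empty `newly_denied` list signals a possible outage from a grant removal, while a non-empty `newly_allowed` list signals a possibly over-permissive change; both are worth reviewing before deployment.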
Key Statistics & Figures
Percentage of IAM policy changes that are grant removals: 10%
Approximately 10% of policy changes involve grant removals, which can lead to outages if not executed correctly.
Days of access logs stored for fine-grained access: 30 days
Uber stores the last 30 days of access logs for fine-grained access control evaluations.
Days of access logs stored for coarse-grained access: 90 days
For coarse-grained access, Uber retains the last 90 days of access logs.
Technologies & Tools
Apache Hive (Database)
Used for storing access logs that are later ingested for policy simulation.
Apache Pinot (Database)
Provides low-latency access to logs for real-time policy simulation.
Cadence (Workflow Orchestration)
Used for managing the workflow of policy simulations to ensure reliability and fault tolerance.
Key Actionable Insights
1. Implement a Policy Simulator to enhance IAM policy safety. By integrating a Policy Simulator, organizations can proactively assess the impact of IAM policy changes, reducing the risk of outages and ensuring compliance with security standards.
2. Adopt a centralized IAM policy management system. Centralizing IAM policy management can streamline processes, improve oversight, and facilitate better coordination among teams, ultimately leading to more secure and efficient operations.
3. Utilize incremental deployment strategies cautiously. While incremental deployment can mitigate risks, it is essential to ensure that rollback triggers and traffic patterns are properly configured to avoid widespread outages from misconfigured policies.
Common Pitfalls
1. Neglecting the testing of IAM policy changes can lead to significant outages.
Without proper testing and safeguards, policy changes can inadvertently disrupt service, as seen in the Uber Eats incident. It is crucial to implement robust testing mechanisms to prevent such issues.
Related Concepts
IAM Policies
Policy Simulation
Access Control Mechanisms
Centralized Policy Management