Overview
This article discusses Netflix's implementation of GraphQL Federation, detailing the core infrastructure, developer experience, schema governance, observability, security, and resilience strategies that support their federated GraphQL architecture. It highlights the challenges faced during adoption and the lessons learned throughout the process.
What You'll Learn
1. How to implement a federated GraphQL architecture using Domain Graph Services (DGS)
2. Why effective schema governance is crucial for GraphQL API evolution
3. How to enhance observability in a federated GraphQL system
4. When to apply security best practices in a GraphQL environment
Prerequisites & Requirements
- Understanding of GraphQL and its federation concepts
- Familiarity with Kotlin and Apollo's GraphQL implementation (optional)
Key Questions Answered
What infrastructure is used for Netflix's federated GraphQL architecture?
Netflix's GraphQL Gateway is based on Apollo’s reference implementation and is written in Kotlin. Kotlin gives the team access to Netflix’s Java ecosystem plus language features such as coroutines for efficient parallel fetches, which improves both performance and developer experience.
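To illustrate why parallel fetches matter at a gateway, here is a minimal sketch. It uses Python's asyncio rather than the Kotlin coroutines the Gateway is written with, and it is not Netflix's code. The service names, simulated latencies, and the helpers `fetch_from_dgs` and `execute_plan` are all hypothetical stand-ins.

```python
import asyncio

# Hypothetical per-service latencies in seconds; a real gateway would make network calls.
DGS_LATENCY = {"movie": 0.05, "talent": 0.05, "production": 0.05}


async def fetch_from_dgs(service: str, sub_query: str) -> dict:
    """Simulate sending one sub-query to one DGS."""
    await asyncio.sleep(DGS_LATENCY[service])
    return {"service": service, "query": sub_query}


async def execute_plan(plan: dict) -> dict:
    """Run independent sub-queries concurrently and merge the results by service."""
    results = await asyncio.gather(
        *(fetch_from_dgs(service, sub_query) for service, sub_query in plan.items())
    )
    return {result["service"]: result for result in results}


def run_demo() -> dict:
    # Assumed query plan: one sub-query per service, with no dependencies between them.
    plan = {
        "movie": "{ movie { title } }",
        "talent": "{ talent { name } }",
        "production": "{ production { status } }",
    }
    return asyncio.run(execute_plan(plan))
```

Because the three sub-queries do not depend on each other, the total wait is roughly that of the slowest service rather than the sum of all three.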
How does Netflix handle schema governance in GraphQL?
Netflix emphasizes active schema management to ensure schema evolution and health. They engage a Studio Data Architect to establish best practices and encourage a collaborative design approach, allowing UI developers to shape the schema to meet their needs.
What observability strategies does Netflix employ for its GraphQL API?
Netflix prioritizes observability through alerting, discovery, and diagnosis. They integrated their Gateway and DGS components with tools like Zipkin and Edgar to enhance visibility, allowing for better monitoring and quicker resolution of issues.
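The idea behind distributed tracing across a gateway and its services can be sketched as follows. This is a simplified illustration in plain Python, not the Zipkin or Edgar API: the `SPANS` list stands in for a tracing backend, and the `span` helper, span names, and the simulated timeout are assumptions made for the example.

```python
import time
import uuid
from contextlib import contextmanager
from typing import Optional

SPANS = []  # stand-in for a tracing backend


@contextmanager
def span(trace_id: str, name: str, parent_id: Optional[str] = None):
    """Record one timed span that shares the request's trace ID."""
    record = {
        "trace_id": trace_id,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent_id,
        "name": name,
        "error": None,
    }
    start = time.perf_counter()
    try:
        yield record["span_id"]
    except Exception as exc:
        record["error"] = repr(exc)  # keep the failure attached to the span
        raise
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000
        SPANS.append(record)


def handle_request() -> str:
    """Simulate a gateway request that calls two services, one of which fails."""
    trace_id = uuid.uuid4().hex
    with span(trace_id, "gateway") as root:
        with span(trace_id, "dgs:movie", parent_id=root):
            pass  # successful call
        try:
            with span(trace_id, "dgs:talent", parent_id=root):
                raise TimeoutError("talent DGS timed out")
        except TimeoutError:
            pass  # the gateway could still return a partial response
    return trace_id
```

Filtering the recorded spans by trace ID shows which service failed and how long each call took, which is the kind of information that supports diagnosis.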
What security measures are in place for Netflix's GraphQL Gateway?
Netflix restricts access to the GraphQL Gateway to trusted authenticated users and limits Graph Introspection to internal developers. Authorization responsibilities are delegated to DGS owners, ensuring consistent access control across applications.
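The two gateway-level rules described above can be sketched as a small admission check. This is a hedged illustration, not Netflix's implementation: the `Caller` type, the `admit` function, and the text-based introspection test are hypothetical, and a real gateway would inspect the parsed query rather than match strings.

```python
from dataclasses import dataclass


class AccessDenied(Exception):
    """Raised when the gateway refuses to forward a request."""


@dataclass(frozen=True)
class Caller:
    user_id: str
    authenticated: bool
    internal_developer: bool = False


def is_introspection(query: str) -> bool:
    # Naive text match for illustration only; "__typename" is deliberately not matched.
    return "__schema" in query or "__type(" in query


def admit(caller: Caller, query: str) -> str:
    """Apply the gateway-level rules, then hand the query on unchanged."""
    if not caller.authenticated:
        raise AccessDenied("caller is not authenticated")
    if is_introspection(query) and not caller.internal_developer:
        raise AccessDenied("introspection is limited to internal developers")
    # Field-level authorization is not decided here; it is left to each DGS owner.
    return query
```

Keeping the gateway check this narrow reflects the division of responsibility in the text: the gateway decides who may reach the graph at all, while each DGS decides what a caller may see.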
Key Statistics & Figures
- Number of engineers contributing to the API daily: hundreds. This surge in contributions occurred after the transition to a federated GraphQL architecture in April 2020.
- Timeline for deprecating the original Studio API monolith: by the end of 2020. This migration was part of the broader strategy to enhance the federated architecture.
Technologies & Tools
- Kotlin (Backend): Used for developing the GraphQL Gateway and schema registry.
- Apollo (Backend): Reference implementation on which the GraphQL Gateway is based.
- Cassandra (Database): Used for storing schema changes with an event sourcing pattern.
- Zipkin (Observability): Distributed tracing system integrated with the Gateway and DGS components for enhanced observability.
- Spinnaker (CI/CD): Used for automatically setting up cloud networking for DGSs.
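The event sourcing pattern mentioned for schema changes can be sketched in a few lines. This is a minimal in-memory illustration, assuming an append-only list in place of Cassandra. The event kinds, the `SchemaEvent` and `SchemaEventLog` names, and the sample SDL strings are hypothetical and do not describe the actual schema registry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SchemaEvent:
    version: int
    service: str
    kind: str  # assumed kinds: "published" or "removed"
    sdl: str = ""


class SchemaEventLog:
    """Append-only log of schema changes; current state is derived by replay."""

    def __init__(self) -> None:
        self._events = []

    def append(self, service: str, kind: str, sdl: str = "") -> SchemaEvent:
        event = SchemaEvent(len(self._events) + 1, service, kind, sdl)
        self._events.append(event)  # events are never updated or deleted
        return event

    def replay(self, up_to: Optional[int] = None) -> dict:
        """Rebuild the service-to-schema mapping as of a given version."""
        state = {}
        for event in self._events:
            if up_to is not None and event.version > up_to:
                break
            if event.kind == "published":
                state[event.service] = event.sdl
            elif event.kind == "removed":
                state.pop(event.service, None)
        return state


log = SchemaEventLog()
log.append("movie", "published", "type Movie { title: String }")
log.append("talent", "published", "type Talent { name: String }")
log.append("movie", "published", "type Movie { title: String year: Int }")
log.append("talent", "removed")
```

Because the log keeps every change, calling `log.replay(up_to=2)` reconstructs the schemas as they stood after the second event, which is what makes auditing and rollback straightforward with this pattern.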
Key Actionable Insights
1. Implement a collaborative schema design process to enhance the quality of your GraphQL API. Engaging multiple teams in schema design can lead to a more robust API that meets diverse needs, reducing future rework and improving overall satisfaction.
2. Prioritize observability in your GraphQL architecture to quickly identify and resolve issues. By integrating tools for alerting and tracing, you can significantly reduce mean time to resolution (MTTR) and enhance the developer experience.
3. Establish a clear security framework for your GraphQL APIs to maintain consistent authorization practices. Delegating authorization to DGS owners can streamline access control and ensure that all applications adhere to the same security standards.
Common Pitfalls
1. Failing to achieve alignment across teams can hinder the adoption of a federated architecture. Initial skepticism and dissent can arise when introducing new concepts. Addressing these concerns proactively is crucial for successful implementation.
2. Neglecting observability can lead to prolonged downtimes and unresolved issues. Without proper monitoring and alerting systems, identifying and diagnosing problems in a complex architecture becomes challenging, impacting user experience.
Related Concepts
GraphQL Federation
Domain Graph Services (DGS)
Schema Governance
Observability in APIs