Behind the Streams: Three Years Of Live at Netflix. Part 1.

Netflix Technology Blog

Overview

This article traces how Netflix built its live streaming capability over the past three years, covering the architectural decisions, challenges, and innovations behind its live events. It contrasts the requirements of live streaming with those of on-demand content and outlines the key pillars of Netflix's live architecture.

What You'll Learn

1. How to architect a live streaming solution that scales to millions of viewers
2. Why using HTTPS-based Live Streaming is beneficial for device compatibility
3. How to implement cloud-based transcoding and packaging for live events
4. When to apply automated testing for live streaming systems

Prerequisites & Requirements

  • Understanding of live streaming technologies and architectures
  • Experience with cloud services and CDN management (optional)

Key Questions Answered

What are the key pillars of Netflix's live streaming architecture?
Netflix's live streaming architecture is built on several key pillars including dedicated broadcast facilities for content ingestion, cloud-based transcoding and packaging pipelines, and scaling content delivery through Open Connect CDN. These elements ensure high-quality streaming experiences for millions of viewers across various devices.
How does Netflix ensure live content delivery to millions of viewers?
Netflix utilizes its Open Connect CDN, which consists of over 18,000 servers in more than 6,000 locations worldwide. This infrastructure allows for efficient scaling and delivery of live content, ensuring that it reaches viewers with minimal latency and high quality.
What challenges did Netflix face in implementing live streaming?
Netflix faced challenges such as the need for rapid execution within nine months, ensuring compatibility across devices, and maintaining high quality without interruptions. They also had to build a robust architecture that could handle the unique demands of live events compared to on-demand content.
What strategies does Netflix use for real-time metrics during live events?
Netflix employs a mix of internally developed tools and open-source technologies to collect real-time metrics during live events. This includes monitoring system performance and user experience, processing up to 38 million events per second to ensure a high-quality viewing experience.
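
As a rough illustration of the kind of aggregation behind real-time metrics, the sketch below keeps a sliding-window event rate in plain Python. It is not Netflix's implementation (the article names Kafka and Druid for this work). The class `RollingRate` and its parameters are hypothetical, and the code assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class RollingRate:
    """Tracks events per second over a sliding window (hypothetical helper)."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, count) batches, oldest first
        self._total = 0

    def record(self, timestamp: float, count: int = 1) -> None:
        """Add a batch of `count` events observed at `timestamp` (seconds)."""
        self._events.append((timestamp, count))
        self._total += count
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop batches at or before the start of the window (now - window, now].
        while self._events and self._events[0][0] <= now - self.window_seconds:
            _, old_count = self._events.popleft()
            self._total -= old_count

    def rate(self, now: float) -> float:
        """Average events per second over the window ending at `now`."""
        self._evict(now)
        return self._total / self.window_seconds


if __name__ == "__main__":
    counter = RollingRate(window_seconds=10)
    counter.record(0, 100)
    counter.record(5, 100)
    print(counter.rate(5))
```

A production system would shard this kind of counter across many consumers and store the results in a time-series or OLAP store. The single-process version only shows the windowing logic.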

Key Statistics & Figures

Number of servers in Open Connect CDN: 18,000+
These servers are placed in more than 6,000 locations worldwide to optimize content delivery to viewers.
Events processed per second during live events: up to 38 million
This processing capacity supports real-time monitoring of, and response to, the user experience during live broadcasts.
Time to launch first live event: 9 months
This timeframe reflects the rapid development and deployment effort required to establish Netflix's live streaming capabilities.

Technologies & Tools

Cloud Service: AWS Elemental MediaConnect
Used to acquire live feeds in the cloud for transcoding.
Cloud Service: AWS Elemental MediaLive
Transcodes live video streams into multiple quality levels.
CDN: Open Connect
Netflix's content delivery network, which supports live streaming.
Data Processing: Kafka
Used for real-time metrics collection during live events.
Data Processing: Druid
Used for real-time insights to ensure a high-quality experience.

Key Actionable Insights

1. Implement cloud-based transcoding to ensure flexibility and scalability in live streaming.
Using cloud services like AWS Elemental MediaLive allows for dynamic scaling and easy integration with existing content management systems, making it essential for handling varying viewer loads during live events.
2. Utilize HTTPS for live streaming to maximize device compatibility.
HTTPS is widely supported across devices, ensuring that a larger audience can access live streams without the need for hardware upgrades, which is crucial for reaching diverse viewer demographics.
3. Conduct regular load testing and failure injection to prepare for live events.
By simulating high traffic and potential failures, Netflix can identify weaknesses in their systems and improve resilience, which is vital for maintaining service quality during peak viewership.
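
To make the failure-injection idea concrete, here is a minimal sketch, assuming two redundant pipelines modeled as plain functions. It wraps a pipeline so that it fails at a configurable rate, then checks that a simple failover loop still returns a segment. All names (`PipelineError`, `with_fault_injection`, `fetch_with_failover`) are hypothetical and are not taken from Netflix's tooling.

```python
import random
from typing import Callable, Sequence

Pipeline = Callable[[int], str]


class PipelineError(Exception):
    """Raised when a (simulated) pipeline cannot produce a segment."""


def with_fault_injection(fn: Pipeline, failure_rate: float,
                         rng: random.Random) -> Pipeline:
    """Return a version of `fn` that fails with probability `failure_rate`."""
    def wrapped(segment_id: int) -> str:
        if rng.random() < failure_rate:
            raise PipelineError(f"injected failure for segment {segment_id}")
        return fn(segment_id)
    return wrapped


def fetch_with_failover(segment_id: int, pipelines: Sequence[Pipeline]) -> str:
    """Try each pipeline in order and return the first successful result."""
    last_error = None
    for pipeline in pipelines:
        try:
            return pipeline(segment_id)
        except PipelineError as err:
            last_error = err  # remember the failure and try the next pipeline
    raise PipelineError(
        f"all pipelines failed for segment {segment_id}") from last_error


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so test runs are repeatable
    primary = with_fault_injection(lambda s: f"primary:{s}", 1.0, rng)
    backup = with_fault_injection(lambda s: f"backup:{s}", 0.0, rng)
    print(fetch_with_failover(7, [primary, backup]))
```

Seeding the random generator keeps injected failures reproducible, which makes a failed resilience test easier to debug.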

Common Pitfalls

1. Underestimating the complexity of live event traffic compared to on-demand traffic.
Live events can generate unpredictable spikes in traffic that are not present in on-demand scenarios. This can lead to system overloads if not properly anticipated and managed.
2. Neglecting to implement robust monitoring and alerting systems.
Without real-time monitoring, issues during live events can go unnoticed, leading to degraded user experiences. It's crucial to have systems in place that can alert teams to problems as they arise.
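
A threshold check is one simple form that such alerting can take. The sketch below compares a snapshot of metrics against per-metric limits and returns an alert for each breach. The metric names (`rebuffer_ratio`, `error_rate`) and the threshold values are illustrative assumptions, not figures from the article.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Alert:
    metric: str
    value: float
    threshold: float


def check_thresholds(metrics: Dict[str, float],
                     thresholds: Dict[str, float]) -> List[Alert]:
    """Return one Alert for each metric whose value exceeds its threshold."""
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        # Metrics missing from the snapshot are skipped, not treated as breaches.
        if value is not None and value > limit:
            alerts.append(Alert(metric=name, value=value, threshold=limit))
    return alerts


if __name__ == "__main__":
    snapshot = {"rebuffer_ratio": 0.05, "error_rate": 0.001}  # made-up values
    limits = {"rebuffer_ratio": 0.02, "error_rate": 0.01}     # made-up limits
    for alert in check_thresholds(snapshot, limits):
        print(f"ALERT {alert.metric}: {alert.value} > {alert.threshold}")
```

In practice a missing metric is often a signal in its own right, so a fuller version might raise a separate "no data" alert instead of skipping it.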

Related Concepts

Live Streaming Architectures
Cloud-based Transcoding Solutions
Content Delivery Networks (CDNs)
Real-time Metrics and Monitoring