Netflix Video Quality at Scale with Cosmos Microservices

Netflix Technology Blog
8 min read · intermediate

Overview

The article describes how Netflix measures video quality at scale on the Cosmos microservices platform. It covers the move from a monolithic system to microservices, focusing on the Video Quality Service (VQS) and its role in computing perceptual quality metrics such as VMAF.

What You'll Learn

1. How to implement video quality measurements using microservices
2. Why separating video quality computations from encoding improves scalability
3. How to utilize the Video Quality Service (VQS) for perceptual quality metrics

Prerequisites & Requirements

  • Understanding of video encoding and quality metrics
  • Familiarity with microservices architecture (optional)

Key Questions Answered

How does Netflix measure video quality at scale?
Netflix measures video quality at scale by utilizing the Video Quality Service (VQS) within the Cosmos microservices architecture. This service independently computes perceptual quality metrics like VMAF and SSIM, allowing for rapid innovation and deployment of new quality algorithms without re-encoding the entire video catalog.
What are the benefits of using Cosmos for video quality computations?
Cosmos provides several benefits including separation of concerns, independent deployments, and improved observability. This architecture allows for rapid prototyping and productization of video quality innovations, enhancing the overall streaming experience for Netflix users.
What challenges did Netflix face with the Reloaded system?
The Reloaded system's monolithic architecture caused tight coupling between video encoding and quality measurement, making it difficult to roll out new algorithms or maintain data quality without costly re-encoding of the entire catalog. This limitation hindered rapid innovation.
How does the VQS workflow handle quality calculations?
The VQS workflow splits video quality calculations into chunks, allowing for parallel processing. Each chunk's quality is computed independently, and results are assembled to provide comprehensive quality metrics, enhancing throughput and reducing latency.
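The split, compute, and assemble steps described above can be sketched in a few lines. This is a minimal illustration, not the VQS implementation: the function names, the chunk size, and the thread pool are assumptions, and per-frame PSNR stands in for a perceptual metric such as VMAF so that the example needs only the standard library.

```python
import math
from concurrent.futures import ThreadPoolExecutor

PSNR_CAP = 100.0  # assumption: cap used when two frames are identical (MSE == 0)


def split_into_chunks(num_frames, chunk_size):
    """Return (start, end) frame ranges that together cover [0, num_frames)."""
    return [(s, min(s + chunk_size, num_frames))
            for s in range(0, num_frames, chunk_size)]


def frame_psnr(ref, dist, peak=255.0):
    """PSNR of one frame; frames are flat sequences of pixel values."""
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    if mse == 0:
        return PSNR_CAP
    return min(PSNR_CAP, 10.0 * math.log10(peak ** 2 / mse))


def score_chunk(ref_frames, dist_frames, span):
    """Compute per-frame scores for one chunk, independently of other chunks."""
    start, end = span
    return [frame_psnr(ref_frames[i], dist_frames[i]) for i in range(start, end)]


def measure_quality(ref_frames, dist_frames, chunk_size=2, workers=4):
    """Split into chunks, score chunks in parallel, then assemble the results."""
    spans = split_into_chunks(len(ref_frames), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the order of `spans`, so frame order is kept.
        per_chunk = list(pool.map(
            lambda span: score_chunk(ref_frames, dist_frames, span), spans))
    per_frame = [score for chunk in per_chunk for score in chunk]
    return {"per_frame": per_frame, "mean": sum(per_frame) / len(per_frame)}
```

Calling `measure_quality(reference_frames, encoded_frames)` returns the per-frame scores in their original order plus their mean. In a real deployment each chunk would more likely run as a separate job on separate machines; the thread pool here only illustrates the independence of the chunks.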

Technologies & Tools

Backend
Cosmos
Used for workflow-driven, media-centric microservices to enhance video quality computations.
Metric
VMAF
Video Multimethod Assessment Fusion, a perceptual video quality metric developed by Netflix.
Metric
SSIM
Structural Similarity Index Measure, a metric used alongside VMAF to measure video quality.
Tool
Nirvana
An observability portal used to trace the VQS workflow.
Backend
Document Conversion Service (DCS)
Handles data model conversions between Cosmos and Reloaded systems.

Key Actionable Insights

1. Adopt a microservices architecture to decouple video quality measurements from encoding processes.
   This approach allows for faster innovation and deployment of new quality algorithms, as seen with Netflix's transition to the Cosmos platform.
2. Implement chunk-based processing in quality calculations to enhance throughput.
   By dividing video into chunks, you can compute quality metrics in parallel, significantly reducing latency and improving overall performance.
3. Utilize the Video Quality Service (VQS) for flexible quality metric calculations.
   VQS allows for the use of multiple perceptual quality metrics simultaneously, providing a comprehensive view of video quality across different formats and devices.
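One way to picture computing several metrics in a single request is a small metric registry. The sketch below is hypothetical and does not reflect the VQS API: `METRICS`, `register`, and `compute_metrics` are invented names, PSNR is included only as a simple stand-in, and `global_ssim` applies the SSIM formula once over the whole frame rather than over local windows as standard SSIM does.

```python
import math
from typing import Callable, Dict, List, Sequence

Frame = Sequence[float]
METRICS: Dict[str, Callable[[Frame, Frame], float]] = {}  # hypothetical registry


def register(name):
    """Decorator that adds a per-frame metric to the registry under `name`."""
    def wrap(fn):
        METRICS[name] = fn
        return fn
    return wrap


@register("psnr")
def psnr(ref, dist, peak=255.0):
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    # Assumption: identical frames are reported as 100 dB instead of infinity.
    return 100.0 if mse == 0 else min(100.0, 10.0 * math.log10(peak ** 2 / mse))


@register("global_ssim")
def global_ssim(ref, dist, peak=255.0):
    """SSIM formula evaluated over the whole frame as a single window."""
    n = len(ref)
    mu_x, mu_y = sum(ref) / n, sum(dist) / n
    var_x = sum((x - mu_x) ** 2 for x in ref) / n
    var_y = sum((y - mu_y) ** 2 for y in dist) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(ref, dist)) / n
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))


def compute_metrics(ref_frames: List[Frame], dist_frames: List[Frame],
                    names: List[str]) -> Dict[str, float]:
    """Return the mean score over all frames for each requested metric."""
    report = {}
    for name in names:
        fn = METRICS[name]
        scores = [fn(r, d) for r, d in zip(ref_frames, dist_frames)]
        report[name] = sum(scores) / len(scores)
    return report
```

With this shape, adding a new metric means registering one more function; callers select metrics by name, for example `compute_metrics(ref, dist, ["psnr", "global_ssim"])`, and nothing on the encoding side changes.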

Common Pitfalls

1. Failing to decouple video quality measurements from encoding can lead to slow innovation.
   This tight coupling means any changes in quality algorithms require re-encoding, which is costly and time-consuming, as experienced with the Reloaded system.
2. Not utilizing chunk-based processing can result in increased latency.
   Without chunking, quality calculations may take longer to complete, reducing the overall efficiency of the video quality service.
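To make the first pitfall concrete, the sketch below keeps quality scores in a store that is separate from the encodes and keyed by metric name and algorithm version. Everything in it is an assumption made for illustration (`Encode`, `QualityStore`, `backfill`, and the key layout are hypothetical, not Netflix's data model); the point is only that rolling out a new algorithm version recomputes scores while the existing encodes stay untouched.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class Encode:
    """Stand-in for an existing encode; frozen so scoring cannot modify it."""
    encode_id: str
    payload: bytes


@dataclass
class QualityStore:
    """Scores keyed by (encode_id, metric_name, algorithm_version)."""
    scores: Dict[Tuple[str, str, str], float] = field(default_factory=dict)


def backfill(encodes: Iterable[Encode], store: QualityStore, metric_name: str,
             version: str, metric_fn: Callable[[Encode], float]) -> int:
    """Score every encode that lacks this metric version; return how many ran."""
    computed = 0
    for enc in encodes:
        key = (enc.encode_id, metric_name, version)
        if key not in store.scores:
            store.scores[key] = metric_fn(enc)  # reads the encode, never rewrites it
            computed += 1
    return computed
```

Running `backfill` a second time with the same version computes nothing, while a new version string triggers a fresh pass over the same encodes with no re-encoding step involved.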

Related Concepts

Microservices Architecture
Perceptual Quality Metrics
Video Encoding Optimization
A/B Testing in Streaming Services