Scaling Media Machine Learning at Netflix

Netflix Technology Blog

Netflix

Cassandra, Java, Machine Learning, Scala, Spring, Spring Boot

Overview

The article discusses Netflix's efforts to scale its media machine learning infrastructure, focusing on the challenges faced by media ML practitioners and the solutions developed to optimize and streamline processes. Key components include media access, feature storage, compute orchestration, and training performance enhancements.

What You'll Learn

1. How to standardize media assets for machine learning applications

2. Why using a feature store can reduce redundant computations in ML pipelines

3. How to implement triggering and orchestration for media ML workflows

Prerequisites & Requirements

Understanding of machine learning concepts and media processing
Familiarity with Python and ML frameworks such as Ray (optional)

Key Questions Answered

What challenges do media ML practitioners face at Netflix?

Media ML practitioners at Netflix face challenges such as accessing and processing diverse media data, training large-scale models efficiently, and productizing models in a self-serve manner. These challenges necessitate a robust infrastructure to streamline workflows and reduce redundancy.

How does Netflix optimize its media ML infrastructure?

Netflix optimizes its media ML infrastructure by standardizing media access through Jasper, creating a feature store with Amber, and implementing a compute orchestration system. This allows for efficient model training and reduces redundant computations across various ML pipelines.

What is the role of the Amber Feature Store?

The Amber Feature Store at Netflix is designed to memoize features and embeddings tied to media entities, promoting reuse and reducing the computational cost associated with media feature computation. It also supports data replication for different storage solutions based on access patterns.
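
The memoization idea can be illustrated with a minimal in-memory sketch. This is not Amber's API: the class `InMemoryFeatureStore`, the key layout (entity id, feature name, version), and the placeholder `fake_shot_embedding` function are all assumptions made for illustration. The point is only that a feature tied to a media entity is computed once and then reused.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical key layout: (media entity id, feature name, feature version).
FeatureKey = Tuple[str, str, int]


@dataclass
class InMemoryFeatureStore:
    """Toy stand-in for a feature store; not Amber's actual interface."""

    _values: Dict[FeatureKey, List[float]] = field(default_factory=dict)
    compute_calls: int = 0  # how many times a feature was really computed

    def get_or_compute(
        self,
        entity_id: str,
        feature: str,
        version: int,
        compute: Callable[[str], List[float]],
    ) -> List[float]:
        key = (entity_id, feature, version)
        if key not in self._values:
            # Cache miss: run the expensive computation once and store it.
            self._values[key] = compute(entity_id)
            self.compute_calls += 1
        return self._values[key]


def fake_shot_embedding(entity_id: str) -> List[float]:
    # Placeholder for an expensive model inference over a video shot.
    return [float(len(entity_id)), 1.0]


store = InMemoryFeatureStore()
first = store.get_or_compute("shot-001", "embedding", 1, fake_shot_embedding)
second = store.get_or_compute("shot-001", "embedding", 1, fake_shot_embedding)
```

Two pipelines that request the same feature for the same entity trigger a single computation; `store.compute_calls` stays at 1 after both calls.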

What is match cutting and how is it implemented at Netflix?

Match cutting is a video editing technique that transitions between shots using similar visual framing. At Netflix, the media ML infrastructure automates the identification of matching shots across titles, so editors do not have to search for candidate pairs by hand.
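
One common way to find visually similar shots is to compare per-shot embedding vectors, as in the sketch below. The summary does not say which representation or similarity measure Netflix uses, so the cosine similarity, the shot identifiers, and the two-dimensional vectors here are assumptions chosen for illustration.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    # Assumes both vectors are non-zero and have the same length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_match_candidates(
    embeddings: Dict[str, List[float]], top_k: int
) -> List[Tuple[str, str, float]]:
    """Score every unordered pair of shots and return the top_k most similar."""
    scored = [
        (cosine_similarity(embeddings[a], embeddings[b]), a, b)
        for a, b in combinations(sorted(embeddings), 2)
    ]
    scored.sort(reverse=True)
    return [(a, b, score) for score, a, b in scored[:top_k]]


# Hypothetical shot embeddings from three different titles.
shots = {
    "title_a/shot_3": [1.0, 0.0],
    "title_b/shot_7": [0.9, 0.1],
    "title_c/shot_1": [0.0, 1.0],
}
best = rank_match_candidates(shots, top_k=1)
```

Here the top candidate pairs `title_a/shot_3` with `title_b/shot_7`, since their vectors point in nearly the same direction. Scoring every pair is quadratic in the number of shots, which is why this step becomes expensive across a large catalog.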

Key Statistics & Figures

Training system throughput increase

3–5 times

This improvement was achieved by optimizing data loading and utilizing a high-performance file system.

Comparisons needed for match cutting across titles

200 trillion computations

This figure illustrates the computational intensity involved when matching shots across multiple titles.
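
To get a feel for how a number of this size arises, the sketch below counts unordered pairs, assuming every shot is compared with every other shot exactly once. The shot count of 20 million is not stated in the source; it is back-solved here only because it produces roughly 200 trillion pairs.

```python
def unordered_pairs(n_shots: int) -> int:
    """Number of distinct pairs among n_shots items: n * (n - 1) / 2."""
    return n_shots * (n_shots - 1) // 2


# Illustrative shot count (an assumption, not a figure from the article).
illustrative_shots = 20_000_000
pairs = unordered_pairs(illustrative_shots)
# pairs is 199,999,990,000,000, i.e. about 2 x 10^14 (200 trillion)
```

Because the count grows with the square of the number of shots, doubling the catalog roughly quadruples the work.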

Technologies & Tools


Backend

Ray

Used for building a large-scale GPU training cluster that supports multi-GPU and multi-node distributed training.

Backend

Amber

Provides a feature store for memoizing features and orchestrating compute tasks.

Backend

Jasper

Facilitates standardized access to media data for ML practitioners.

Backend

Marken

A scalable service used to persist feature values as annotations associated with Netflix media entities.

Key Actionable Insights

1. Standardizing media assets can significantly enhance the efficiency of machine learning workflows.
By ensuring that all media files adhere to a consistent format, practitioners can reduce the time spent on preprocessing and improve the quality of model training.

2. Utilizing a feature store like Amber can minimize redundant computations across different ML pipelines.
This approach not only saves computational resources but also ensures consistency in feature representation across various models.

3. Implementing robust triggering and orchestration mechanisms can streamline the processing of new media assets.
By automating these processes, teams can respond more quickly to incoming data, allowing for faster turnaround times in content production.
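
The triggering and orchestration insight can be sketched as a small planner that runs when a new asset arrives: it works out which features are still missing and orders them so that each one is computed after the features it depends on. The feature names, the `DEPENDENCIES` graph, and the functions below are hypothetical; the summary does not describe how Netflix's orchestration is actually implemented. The ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical dependency graph: feature -> features it needs first.
DEPENDENCIES: Dict[str, Set[str]] = {
    "shot_boundaries": set(),
    "shot_embeddings": {"shot_boundaries"},
    "match_cut_candidates": {"shot_embeddings"},
}


def plan_for_new_asset(requested: str, already_computed: Set[str]) -> List[str]:
    """Return the features still to compute for one asset, in dependency order."""
    needed: Set[str] = set()
    stack = [requested]
    while stack:
        feature = stack.pop()
        if feature in needed or feature in already_computed:
            continue  # stored features (and their inputs) need no new work
        needed.add(feature)
        stack.extend(DEPENDENCIES[feature])
    graph = {f: DEPENDENCIES[f] & needed for f in needed}
    return list(TopologicalSorter(graph).static_order())


def on_new_asset(
    asset_id: str, requested: str, computed_by_asset: Dict[str, Set[str]]
) -> List[str]:
    """Event handler: plan the work for an asset and record it as done."""
    done = computed_by_asset.setdefault(asset_id, set())
    plan = plan_for_new_asset(requested, done)
    done.update(plan)  # a real system would execute each step here
    return plan


state: Dict[str, Set[str]] = {}
first_plan = on_new_asset("video-1", "match_cut_candidates", state)
repeat_plan = on_new_asset("video-1", "match_cut_candidates", state)
```

The first event yields the full chain of three steps; the repeated event yields an empty plan because everything is already recorded as computed, which is the same reuse a feature store provides.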

Common Pitfalls

1. Failing to standardize input file formats can lead to inconsistencies in model performance.
Without a consistent format, the quality of feature representations may vary, resulting in poor matching quality across different titles.

2. Redundant computations can waste resources and slow down ML pipelines.
Many ML practitioners independently compute the same features, which can be avoided by using a shared feature store.
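
The first pitfall can be guarded against with a check that runs before feature computation. The sketch below compares an asset's properties with a standard profile and reports every deviation. The `MediaSpec` fields and the values in `STANDARD` are invented for illustration; the summary does not specify what Netflix's standardized format contains.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MediaSpec:
    # Hypothetical properties that a standard profile might pin down.
    container: str
    width: int
    height: int
    frames_per_second: float


# Illustrative target profile (assumed values, not Netflix's actual standard).
STANDARD = MediaSpec(container="mp4", width=1280, height=720, frames_per_second=24.0)


def deviations(asset: MediaSpec, standard: MediaSpec = STANDARD) -> List[str]:
    """List every property where the asset differs from the standard profile."""
    problems = []
    for name in ("container", "width", "height", "frames_per_second"):
        actual, expected = getattr(asset, name), getattr(standard, name)
        if actual != expected:
            problems.append(f"{name}: expected {expected}, got {actual}")
    return problems


ok = deviations(MediaSpec("mp4", 1280, 720, 24.0))
bad = deviations(MediaSpec("mov", 1920, 1080, 24.0))
```

An asset with a non-empty deviation list would be transcoded to the standard profile (or rejected) before any features are computed from it, so that embeddings from different titles stay comparable.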

Related Concepts

Machine Learning In Media Processing

Feature Engineering And Storage

Video Editing Techniques