Data Engineers of Netflix — Interview with Kevin Wylie

Netflix Technology Blog

Overview

The article features an interview with Kevin Wylie, a Data Engineer at Netflix, discussing his journey in data engineering, his experiences at Netflix, and the evolution of his role over the past decade. It highlights the technical challenges faced in building the entertainment knowledge graph and the importance of data engineering in supporting analytics.

What You'll Learn

1. How to design and build a knowledge graph for content analytics
2. Why understanding entity resolution is crucial in data engineering
3. When to transition from a management role to a hands-on engineering role

Prerequisites & Requirements

  • Basic understanding of data engineering concepts
  • Experience with SQL and big data technologies (optional)

Key Questions Answered

What motivated Kevin Wylie to pursue a career in data engineering?
Kevin Wylie initially stumbled into data engineering from application development. His interest grew as he deepened his knowledge of big data, particularly during his time at MySpace, where he experienced data warehousing at internet scale.
How has Kevin's role evolved at Netflix over the years?
Kevin's role has evolved from being part of a small analytics team to handling a more mature business with diverse analytics stakeholders. He transitioned from management back to hands-on engineering to create impactful data products.
What technical challenges did Kevin face while building the knowledge graph?
Building the knowledge graph involved two main challenges: entity resolution, which means deciding whether movie names that differ across languages refer to the same entity, and implementing distributed graph algorithms in Spark.
What attracted Kevin to work at Netflix?
Kevin was drawn to Netflix due to the opportunity to work with large-scale data, the company's culture of trust, and the alignment of his personal interests in movies and TV shows with his professional work in analytics.
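
The entity resolution step described above can be sketched on a single machine. The Python example below is a minimal sketch under stated assumptions, not Netflix's implementation: it assumes a hypothetical upstream matcher has already produced pairs of records judged to be the same title, and it groups those pairs with union-find, a standard way to compute connected components. The record IDs, titles, and match pairs are invented for illustration; a production pipeline would likely run a distributed equivalent, for example on Spark.

```python
from collections import defaultdict

# Hypothetical title records keyed by record ID (illustrative data only).
RECORDS = {
    "r1": ("Spirited Away", "en"),
    "r2": ("千と千尋の神隠し", "ja"),
    "r3": ("El viaje de Chihiro", "es"),
    "r4": ("Roma", "es"),
    "r5": ("Roma", "en"),
    "r6": ("Okja", "en"),
}

# Assumed output of a pairwise matcher: each pair was judged "same entity".
# Note that r1 and r3 are never matched directly, only through r2.
MATCHES = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]


class UnionFind:
    """Disjoint-set structure with path compression."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen items as their own root.
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Point x at its grandparent to shorten future lookups.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def resolve_entities(record_ids, matches):
    """Group record IDs into entities: one connected component per entity."""
    uf = UnionFind()
    for record_id in record_ids:
        uf.find(record_id)  # unmatched records become singleton entities
    for a, b in matches:
        uf.union(a, b)
    clusters = defaultdict(list)
    for record_id in record_ids:
        clusters[uf.find(record_id)].append(record_id)
    return sorted(sorted(cluster) for cluster in clusters.values())


ENTITIES = resolve_entities(RECORDS, MATCHES)
print(ENTITIES)
```

Treating matches as graph edges means transitive links are resolved automatically: r1 and r3 end up in the same entity even though no pair connects them directly. The same property is why one bad match can merge two unrelated titles, so the quality of the pairwise matcher matters.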

Key Statistics & Figures

Years at Netflix: 10 years
Kevin Wylie has been at Netflix since 2011, contributing significantly to the evolution of the data engineering team.
Number of countries served by Netflix: 190 countries
Netflix has grown to become a leading streaming service, serving members globally.
Size of the initial content analytics team: 3 people
When Kevin joined, the content analytics team was very small, highlighting the company's growth over the years.

Technologies & Tools

  • Spark (Backend): Used for implementing distributed graph algorithms in the knowledge graph project.
  • Hadoop (Backend): Previously tested in high-scale areas before being replaced by Spark.
  • Ab Initio (Backend): An ETL tool used in the early days of Netflix's data engineering.

Key Actionable Insights

1. Focus on building a knowledge graph to enhance content analytics capabilities.
   Creating a knowledge graph can reveal insights that are otherwise hidden, enabling better understanding of trends in movies and shows, which is crucial for data-driven decision-making.
2. Emphasize the importance of entity resolution in data engineering projects.
   Entity resolution is vital for ensuring data accuracy and consistency, especially in projects involving multilingual datasets or diverse data sources.
3. Consider transitioning back to hands-on roles to maintain engagement and impact.
   As organizations grow, engineers may find fulfillment in directly creating data products that empower analytics teams, rather than solely managing others.

Common Pitfalls

1. Underestimating the complexity of entity resolution in data projects.
   Data engineers may overlook the difficulty of keeping data consistent across different languages and formats, which can lead to inaccurate analytics results.
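
The format side of this pitfall can be illustrated with a small normalization step. The Python sketch below is an assumption about what such a step might look like, not something described in the interview: the hypothetical helper `normalize_title` uses Unicode NFKC normalization, case folding, and punctuation stripping so that the same title typed in different encodings or styles compares equal. It does not handle translations, which need separate matching evidence.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Return a comparison key for a title (hypothetical helper)."""
    # NFKC composes accents and maps full-width or compatibility
    # characters to their standard forms.
    text = unicodedata.normalize("NFKC", title)
    # casefold() is a more aggressive lower() intended for comparisons.
    text = text.casefold()
    # Replace anything that is not a word character or whitespace.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(text.split())


if __name__ == "__main__":
    for raw in ["Spider-Man: Homecoming", "  SPIDER MAN  homecoming "]:
        print(repr(raw), "->", repr(normalize_title(raw)))
```

A key like this is best treated as a cheap first pass for grouping candidate records, with the real same-entity decision left to stronger evidence.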

Related Concepts

Data Engineering
Content Analytics
Knowledge Graphs
Entity Resolution
Big Data Technologies