Genie 2.0: Second Wish Granted!

Netflix Technology Blog

Netflix

9 min read • intermediate

AWS, Docker, JPA, Spring, XML

Overview

The article describes Genie 2.0, the second version of Netflix's distributed job and resource management tool, which is more flexible and extensible than its predecessor, Genie 1.0. Key improvements include a generic data model, flexible selection of the job execution environment, and richer API support, which together allow better integration with modern big data technologies.

What You'll Learn

1. How to implement a flexible job execution environment using Genie 2.0

2. Why a generic data model is essential for multi-tenant distributed processing

3. How to leverage tags for cluster and command resolution in Genie 2.0

Prerequisites & Requirements

Understanding of distributed job management concepts
Familiarity with big data technologies like Hadoop and Spark (optional)

Key Questions Answered

What are the main improvements in Genie 2.0 compared to Genie 1.0?

Genie 2.0 introduces a generic data model, flexible job execution environment selection, and richer API support. These enhancements allow it to work with multiple processing clusters like Hadoop 2, Spark, and Presto, addressing the limitations of Genie 1.0, which was restricted to Hadoop 1 and had a fixed data model.

How does Genie 2.0 select the execution environment for jobs?

Genie 2.0 uses a flexible method to select the execution environment by allowing job requests to specify command and cluster tags. This enables prioritization and fallback options for cluster selection, ensuring jobs are executed on available resources efficiently.
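The tag-based selection described above can be sketched roughly as follows. This is a minimal illustration and not Genie's actual implementation: the `Command` and `Cluster` classes, the `resolve` function, the `available` flag, and the tag strings such as `sched:sla` are all hypothetical names chosen for the example. It assumes a job request carries one set of command tags plus a priority-ordered list of cluster tag sets, and that later sets act as fallbacks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    name: str
    tags: frozenset


@dataclass(frozen=True)
class Cluster:
    name: str
    tags: frozenset
    commands: tuple  # commands this cluster can run (assumed relationship)
    available: bool = True


def resolve(clusters, cluster_criteria, command_tags):
    """Try each cluster tag set in priority order and return the first
    (cluster, command) pair that satisfies both tag requirements."""
    wanted_command = frozenset(command_tags)
    for criteria in cluster_criteria:
        wanted_cluster = frozenset(criteria)
        for cluster in clusters:
            if not cluster.available or not wanted_cluster <= cluster.tags:
                continue
            for command in cluster.commands:
                if wanted_command <= command.tags:
                    return cluster, command
    return None  # nothing matched any criteria set


hive = Command("hive", frozenset({"type:hive"}))
clusters = [
    Cluster("sla", frozenset({"sched:sla", "type:yarn"}), (hive,), available=False),
    Cluster("adhoc", frozenset({"sched:adhoc", "type:yarn"}), (hive,)),
]

# Prefer the SLA cluster, fall back to the ad hoc cluster.
match = resolve(clusters, [{"sched:sla"}, {"sched:adhoc"}], {"type:hive"})
print(match[0].name, match[1].name)  # prints: adhoc hive
```

Because the preferred cluster is marked unavailable in this example, the first criteria set matches nothing and the request falls through to the second one.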

What technologies are integrated into Genie 2.0?

Genie 2.0 integrates with technologies such as Hadoop 2, Spark, Presto, and Docker, facilitating a modern big data platform. This integration allows for better resource management and supports a wider range of job types compared to its predecessor.

What is the significance of the new data model in Genie 2.0?

The new data model in Genie 2.0 allows jobs to run on any multi-tenant distributed processing cluster, enhancing flexibility. It includes entities like Cluster, Command, Application, and Job, each supporting tags for better metadata management and resolution.
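One way to picture the four entities is as plain records that share a tag set. The sketch below is an assumption-laden illustration, not the schema Genie uses: the `Tagged` base class, the `has_tags` helper, the extra fields, and the links between entities (a cluster listing its commands, a command pointing at an application) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Tagged:
    """Shared shape: every entity has a name and a set of free-form tags."""
    name: str
    tags: Set[str] = field(default_factory=set)

    def has_tags(self, *wanted: str) -> bool:
        # True when every requested tag is present on this entity.
        return set(wanted) <= self.tags


@dataclass
class Application(Tagged):
    pass


@dataclass
class Command(Tagged):
    application: Optional[Application] = None  # assumed link


@dataclass
class Cluster(Tagged):
    commands: List[Command] = field(default_factory=list)  # assumed link


@dataclass
class Job(Tagged):
    command_args: str = ""


spark = Application("spark", {"type:spark"})
submit = Command("spark-submit", {"type:spark-submit"}, application=spark)
cluster = Cluster("adhoc", {"sched:adhoc", "type:yarn"}, commands=[submit])
job = Job("nightly-report", {"team:data"}, command_args="--class Report report.jar")

# Tags can be queried the same way on any entity type.
tagged_yarn = [e.name for e in (spark, submit, cluster, job) if e.has_tags("type:yarn")]
print(tagged_yarn)  # prints: ['adhoc']
```

Keeping tags on a common base is just one design choice; it makes the same lookup work for every entity type.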

Key Statistics & Figures

Number of AWS instances used in production

12 to 20

Genie 2.0 currently autoscales between twelve and twenty i2.2xlarge AWS instances, allowing several hundred jobs to run simultaneously.

Number of tests added to the Genie codebase

600

Almost six hundred tests have been added to improve reliability and maintainability of the Genie codebase.

Technologies & Tools

Backend

Hadoop 2

Used as an execution engine for job submissions.

Backend

Spark

Integrated as a processing engine within the Genie framework.

Backend

Presto

Utilized for interactive query execution in the big data platform.

Tools

Docker

A container technology that is changing how applications are managed and deployed.

Key Actionable Insights

1. Utilize Genie 2.0's flexible job execution environment to optimize resource allocation during peak loads.
By leveraging the ability to specify command and cluster tags, you can ensure that jobs are routed to the most appropriate resources, improving efficiency and reducing wait times.

2. Adopt the new data model to facilitate integration with various big data tools.
Implementing the generic data model allows for seamless job submissions across different processing engines, making it easier to adapt to evolving technology landscapes.

3. Take advantage of Genie 2.0's richer API support for better automation and integration.
With fine-grained APIs, you can automate job submissions and resource management more effectively, reducing manual overhead and increasing operational efficiency.
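As a rough illustration of what automating a submission could look like, the sketch below assembles a job request body as JSON. The field names (`commandCriteria`, `clusterCriterias`, `commandArgs`) and the `build_job_request` helper are assumptions made for this example and are not confirmed by the summary; check the Genie API documentation for the real request schema before relying on them. No endpoint is shown, since the summary does not give one.

```python
import json


def build_job_request(name, user, command_tags, cluster_criteria, command_args):
    """Assemble a job request body; all field names are illustrative."""
    return {
        "name": name,
        "user": user,
        # Tags the chosen command must carry.
        "commandCriteria": sorted(command_tags),
        # Cluster tag sets in priority order; later entries are fallbacks.
        "clusterCriterias": [{"tags": sorted(tags)} for tags in cluster_criteria],
        "commandArgs": command_args,
    }


request = build_job_request(
    name="nightly-report",
    user="analyst",
    command_tags={"type:hive"},
    cluster_criteria=[{"sched:sla"}, {"sched:adhoc"}],
    command_args="-f report.q",
)
body = json.dumps(request, indent=2)
print(body)
```

A scheduler or script could generate such bodies for many jobs and send them with any HTTP client, which is where the reduction in manual overhead would come from.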

Common Pitfalls

1. Failing to leverage the flexible job execution environment can lead to inefficient resource usage.

Without utilizing the tagging system for command and cluster selection, jobs may end up queued on less optimal resources, increasing execution time and operational costs.

2. Not updating to the new data model can hinder integration with modern tools.

Sticking with the old data model limits the ability to run jobs across different processing engines, which can stifle innovation and adaptability in a rapidly changing tech landscape.

Related Concepts

Distributed Job Management

Big Data Technologies

Job Execution Environments

API Design And Implementation