Under the Hood of Uber ATG’s Machine Learning Infrastructure and Versioning Control Platform for Self-Driving Vehicles

Overview

The article discusses Uber ATG's machine learning infrastructure and versioning control platform, VerCD, designed to manage the complexities of developing self-driving vehicles. It outlines the five-step life cycle of machine learning models, the various components involved, and how VerCD facilitates continuous integration and delivery (CI/CD) for efficient model management.

What You'll Learn

1. How to implement a five-step life cycle for machine learning models
2. Why continuous delivery is crucial for managing ML artifacts
3. How to automate data ingestion and validation processes

Prerequisites & Requirements

  • Understanding of machine learning concepts and workflows
  • Familiarity with CI/CD tools such as Jenkins (optional)

Key Questions Answered

What is the purpose of VerCD in Uber ATG's ML workflow?
VerCD is designed to manage versioning and dependencies of machine learning artifacts in Uber ATG's self-driving vehicle development. It tracks all dependencies, including data and model artifacts, ensuring reproducibility and traceability throughout the ML workflow.
How does Uber ATG ensure the quality of its ML models?
Uber ATG uses a five-step life cycle for its ML models: data ingestion, data validation, model training, model evaluation, and model serving. This structured approach helps ensure that models meet their quality metrics before they are deployed to self-driving vehicles.
What challenges does Uber ATG face with ML model dependencies?
Deep dependency graphs in the self-driving domain make continuous delivery difficult. A change to one model or dataset can affect every model that depends on it, so these interactions must be tracked and managed to avoid inconsistencies.
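
The dependency tracking described in the answers above can be illustrated with a minimal sketch. It is not VerCD's actual schema or API: the `Artifact` class, its fields, and the hashing scheme are all assumptions made for illustration. The idea shown is that an artifact's version identifier is derived from its own parameters plus the identifiers of everything it depends on, so a change to an upstream dataset yields a new identifier for every model built from it.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class Artifact:
    """Hypothetical record for a dataset or model (not VerCD's real schema)."""
    name: str
    params: dict
    dependencies: tuple = ()

    def version_id(self) -> str:
        # The identifier covers this artifact's own parameters and the
        # identifiers of all of its dependencies, recursively.
        payload = {
            "name": self.name,
            "params": self.params,
            "deps": sorted(dep.version_id() for dep in self.dependencies),
        }
        blob = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(blob).hexdigest()[:12]


# Two models with identical hyperparameters but different dataset snapshots
# (names and values are made up for the example).
dataset_v1 = Artifact("dataset", {"snapshot": "v1"})
dataset_v2 = Artifact("dataset", {"snapshot": "v2"})
model_a = Artifact("model", {"lr": 0.001}, (dataset_v1,))
model_b = Artifact("model", {"lr": 0.001}, (dataset_v2,))

print(model_a.version_id(), model_b.version_id())
```

Because the dataset snapshot differs, the two models receive different identifiers even though their own parameters are the same, which is what makes a result traceable back to the exact data it was trained on.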

Key Statistics & Figures

Daily trips supported by Uber: 14 million
This statistic highlights the scale at which Uber operates and the importance of efficient ML model management.

Key Actionable Insights

1. Implementing a structured life cycle for ML models can improve the quality and reliability of deployments.
By following a defined process, teams can ensure that each model is validated before deployment, reducing the risk of failures in production.
2. Automating data ingestion and validation can streamline model training and improve iteration speed.
This lets engineers focus on refining models rather than managing data manually, leading to faster development cycles.
3. Using a version control system tailored to ML artifacts can help manage complex dependencies.
This is particularly important in self-driving vehicle development, where a change in one model can affect others, so every component must be tracked.
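
The five-step life cycle behind these insights can be sketched as a small pipeline in which validation and evaluation act as gates. This is a toy illustration, not Uber ATG's implementation: the function names, the train/test overlap check, the one-parameter model, and the `max_mae` threshold are all assumptions chosen to keep the example self-contained.

```python
def ingest():
    # Step 1: stand-in for selecting raw data and splitting it into sets.
    rows = [{"x": float(i), "y": 2.0 * i} for i in range(10)]
    return {"train": rows[:8], "test": rows[8:]}


def validate(data):
    # Step 2: reject empty splits and any overlap between train and test.
    train_x = {row["x"] for row in data["train"]}
    test_x = {row["x"] for row in data["test"]}
    if not train_x or not test_x:
        raise ValueError("empty data split")
    if train_x & test_x:
        raise ValueError("train and test sets overlap")
    return data


def train(data):
    # Step 3: fit y = slope * x by least squares (toy model).
    num = sum(row["x"] * row["y"] for row in data["train"])
    den = sum(row["x"] ** 2 for row in data["train"])
    return {"slope": num / den}


def evaluate(model, data):
    # Step 4: mean absolute error on the held-out test set.
    errors = [abs(model["slope"] * row["x"] - row["y"]) for row in data["test"]]
    return {"mae": sum(errors) / len(errors)}


def serve(model, metrics, max_mae=0.1):
    # Step 5: only models that pass the (assumed) quality threshold are served.
    if metrics["mae"] > max_mae:
        raise RuntimeError("model rejected: MAE above threshold")
    return {"status": "deployed", "model": model}


def run_life_cycle():
    data = validate(ingest())
    model = train(data)
    metrics = evaluate(model, data)
    return serve(model, metrics)


print(run_life_cycle()["status"])
```

Chaining the steps in one function means a failure at any gate stops the run before a model reaches the serving step.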

Common Pitfalls

1. Failing to track dependencies can lead to inconsistent model performance.
Without proper versioning and dependency management, changes in one model can negatively impact others, leading to unexpected behavior in production.
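
One way to picture this pitfall is to compute which artifacts sit downstream of a change. The sketch below assumes a hypothetical dependency graph (the artifact names and the `DOWNSTREAM` mapping are invented for the example) and walks it breadth-first to list everything that would need to be rebuilt or re-validated.

```python
from collections import deque

# Hypothetical graph: each key is an artifact, each value lists the
# artifacts that are built directly from it.
DOWNSTREAM = {
    "dataset": ["model_a"],
    "model_a": ["model_b", "model_c"],
    "model_b": ["model_d"],
    "model_c": ["model_d"],
    "model_d": [],
}


def affected_by(changed, downstream):
    """Return every artifact that directly or indirectly depends on `changed`."""
    seen = []
    queue = deque(downstream.get(changed, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already recorded via another path
        seen.append(node)
        queue.extend(downstream.get(node, []))
    return seen


print(affected_by("model_a", DOWNSTREAM))
```

If this set is not tracked, a retrained upstream model can silently change the inputs of the models that consume its outputs.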

Related Concepts

Continuous Integration and Continuous Delivery (CI/CD)
Machine Learning Operations (MLOps)
Dependency Management in Software Development