Three Approaches to Scaling Machine Learning with Uber Seattle Engineering

Overview

This article covers three approaches the Uber Seattle Engineering team uses to scale machine learning: Horovod for distributed deep learning, Michelangelo for ML-as-a-service, and Pyro for probabilistic programming. It explains how each one supports Uber's operations and what each contributes to the broader tech community.

What You'll Learn

1. How to use Horovod for distributed deep learning with Apache Spark
2. Why Michelangelo simplifies the machine learning process
3. How to model rider behavior using Pyro

Prerequisites & Requirements

  • Understanding of machine learning concepts
  • Familiarity with Apache Spark (optional)

Key Questions Answered

How does Horovod facilitate distributed deep learning?
Horovod is an open source framework for distributed deep learning that scales training across hundreds of machines. It hides much of the infrastructure complexity, so ML engineers can focus on their models instead of on cluster setup. The article cites NVIDIA and Amazon among the major companies that use it.
What is Michelangelo and how does it work?
Michelangelo is Uber's ML-as-a-service platform. It simplifies machine learning workflows through flexible learners and transformers, integrates with Apache Spark, and provides a user interface with a dashboard for inspecting models and results.
How can Pyro be used for modeling rider behavior?
Pyro is a probabilistic programming language, built on Python and PyTorch, that data scientists can use to model and predict rider behavior. It supports recursive simulation, which makes it suitable for complex modeling tasks such as those required for Uber's rewards programs.
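
The idea behind the Horovod answer above can be illustrated with a small sketch. This is not Horovod's API and not Uber's code: it is a single-process, standard-library simulation of data-parallel training, in which each hypothetical worker computes a gradient on its own shard of the data and the gradients are averaged before every update. The function names, the toy model y = w * x, the worker count, and the learning rate are all assumptions made for illustration.

```python
# Conceptual sketch only: simulates data-parallel training in one process.
# It does not use Horovod; the model, worker count, and learning rate are illustrative.

def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(values):
    """Stand-in for an allreduce: every worker ends up with the same average."""
    return sum(values) / len(values)

def train(shards, w=0.0, lr=0.01, steps=200):
    """Run gradient descent where each step averages the per-worker gradients."""
    for _ in range(steps):
        grads = [local_gradient(w, shard) for shard in shards]  # one gradient per worker
        w -= lr * allreduce_mean(grads)  # identical update applied on every worker
    return w

# Data generated from y = 3x, split across four hypothetical workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]
w = train(shards)
print(round(w, 3))
```

In a real cluster the averaging step is the part that runs over the network, so each worker only ever touches its own shard of the data while all workers keep identical model weights.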

Technologies & Tools

  • Horovod (framework): Used for distributed deep learning to scale machine learning across multiple machines.
  • Michelangelo (platform): Uber's ML-as-a-service platform that simplifies machine learning workflows.
  • Pyro (programming language): A probabilistic programming language for modeling and predicting user behavior.
  • Apache Spark (framework): Used in conjunction with Horovod and Michelangelo for processing large datasets.

Key Actionable Insights

1. Using Horovod can significantly shorten model training time by distributing the workload across multiple machines.
   This is particularly beneficial for organizations working with large datasets, because it allows faster experimentation and iteration on models.
2. Implementing Michelangelo can streamline machine learning operations by providing one cohesive platform for model development and deployment.
   This is especially useful for teams that want to reduce the complexity of their ML workflows and improve collaboration among data scientists.
3. Using Pyro for probabilistic modeling can give deeper insight into user behavior patterns, which supports better decision-making.
   Data scientists can use Pyro to make more accurate predictions, which is crucial for tailoring services to user needs.
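
As a rough illustration of what probabilistic modeling of rider behavior means, the sketch below treats the chance that a rider takes a trip on a given day as an unknown quantity with a Beta prior, updates it from observed days, and then simulates the coming week. It uses only the Python standard library and does not use Pyro; the rider history, the uniform prior, and the function names are hypothetical. A probabilistic programming language lets you write richer models in the same spirit without deriving the update by hand.

```python
import random

def posterior(prior_a, prior_b, observations):
    """Conjugate Beta-Bernoulli update: observations are 1 (rode) or 0 (did not)."""
    rides = sum(observations)
    return prior_a + rides, prior_b + len(observations) - rides

def simulate_rides(a, b, days, draws=5000, seed=0):
    """Monte Carlo estimate of expected rides over `days`, sampling the ride rate from Beta(a, b)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(draws):
        p = rng.betavariate(a, b)  # one plausible ride rate for this rider
        total += sum(rng.random() < p for _ in range(days))  # simulate each day
    return total / draws

# Hypothetical rider: rode on 6 of the last 10 days; Beta(1, 1) is a uniform prior.
history = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
a, b = posterior(1, 1, history)
print(a, b, a / (a + b))            # posterior parameters and posterior mean ride rate
print(simulate_rides(a, b, days=7)) # estimated rides in the next week
```

The simulation keeps the uncertainty about the ride rate instead of collapsing it to a single number, which is the main reason to reach for a probabilistic model in the first place.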

Common Pitfalls

1. Failing to integrate machine learning tools such as Horovod and Michelangelo properly can lead to inefficient workflows.
   This often happens when teams do not fully understand what these tools can do or how they complement each other, and the result is suboptimal performance.