How Uber Uses Ray® to Optimize the Rides Business

Kaichen Wei, Matt Walker, Peng Zhang

Uber


15 min read • advanced


Apache, Apache Spark, AWS, Docker, Kubernetes, Pandas, PySpark, XGBoost

Overview

This article describes how Uber uses Ray®, a general compute engine for Python®, to make its rides business more efficient by speeding up its machine learning models and optimization algorithms. It highlights the resulting performance gains, including a 40-fold speedup in budget allocation optimization, and explains how Ray was integrated with Uber's existing systems.

What You'll Learn

1. How to leverage Ray for parallelizing Python functions in machine learning applications

2. Why integrating Spark and Ray can optimize data processing and computation tasks

3. How to improve deployment and launch times for machine learning jobs using object storage

Prerequisites & Requirements

Understanding of machine learning concepts and optimization algorithms
Familiarity with Ray and Spark frameworks (optional)

Key Questions Answered

How does Uber optimize its rides business using Ray?

Uber optimizes its rides business by using Ray to parallelize machine learning models and optimization algorithms, which significantly enhances computational efficiency. This integration allows for faster processing and improved developer productivity, achieving performance improvements of up to 40 times in budget allocation tasks.
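
As a rough illustration of what parallelizing a Python function with Ray can look like, the sketch below fits a toy per-city response slope and fans the fits out as Ray tasks. The function names, the per-city data layout, and the least-squares model are assumptions made for this example and are not taken from Uber's code; only `ray.init`, `ray.remote`, and `ray.get` are Ray's own calls.

```python
from statistics import mean


def fit_city_response(spend, trips):
    """Least-squares slope of trips vs. spend for one city (toy model, hypothetical)."""
    s_bar, t_bar = mean(spend), mean(trips)
    cov = sum((s - s_bar) * (t - t_bar) for s, t in zip(spend, trips))
    var = sum((s - s_bar) ** 2 for s in spend)
    return cov / var


def fit_all_cities(city_data):
    """Fit every city concurrently, one Ray task per city.

    `city_data` maps a city name to a (spend, trips) pair of equal-length lists.
    """
    import ray  # imported lazily so fit_city_response stays usable without Ray

    ray.init(ignore_reinit_error=True)
    remote_fit = ray.remote(fit_city_response)  # wrap the plain function as a task
    cities = list(city_data)
    # Submitting returns immediately with object references; tasks run in parallel.
    refs = [remote_fit.remote(*city_data[city]) for city in cities]
    # ray.get blocks until every task has finished and returns results in order.
    return dict(zip(cities, ray.get(refs)))
```

Calling `fit_all_cities({"sf": ([1, 2, 3], [2, 4, 6])})` on a machine with Ray installed would run each city's fit as a separate task. Keeping the per-city function free of Ray imports means it can still be unit-tested or run sequentially.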

What challenges did Uber face when migrating from Spark to Ray?

Uber encountered several challenges during the migration from Spark to Ray, including bottlenecks in parallel processing and the need to rewrite legacy PySpark code. The integration of both frameworks was necessary to leverage their respective strengths while minimizing migration costs.

What is the role of the ADMM optimizer in Uber's budget allocation?

The ADMM (Alternating Direction Method of Multipliers) optimizer is used to allocate budgets across various city-levers efficiently. It handles non-linear, non-convex problems and scales well as more cities or levers are added, allowing Uber to optimize its incentive structures effectively.
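
To make the ADMM structure concrete, here is a minimal sketch on a deliberately simplified problem. It assumes each lever has a concave response `gain * log(1 + spend)` and a single total-budget constraint; both are stand-ins chosen so the updates have closed forms. Uber's actual problem is described as non-linear and non-convex, and its formulation is not given in this summary, so every name and formula below is hypothetical.

```python
import math


def lever_update(gain, v, rho):
    """Closed-form minimizer of -gain*log(1+x) + (rho/2)*(x - v)**2 over x > -1."""
    return 0.5 * ((v - 1.0) + math.sqrt((v + 1.0) ** 2 + 4.0 * gain / rho))


def project_to_budget(v, budget):
    """Euclidean projection of v onto {z >= 0, sum(z) == budget}."""
    running, theta = 0.0, 0.0
    for k, value in enumerate(sorted(v, reverse=True), start=1):
        running += value
        shift = (running - budget) / k
        if value - shift > 0:
            theta = shift  # keep the shift from the last index that stays positive
    return [max(value - theta, 0.0) for value in v]


def admm_allocate(gains, budget, rho=1.0, iters=500):
    """Split `budget` across levers with scaled-form ADMM (toy convex case)."""
    n = len(gains)
    z = [budget / n] * n  # consensus copy that always satisfies the budget
    u = [0.0] * n         # scaled dual variables
    for _ in range(iters):
        # x-step: each lever is solved independently, so this loop could be
        # distributed across workers.
        x = [lever_update(g, zi - ui, rho) for g, zi, ui in zip(gains, z, u)]
        # z-step: pull the combined proposal back onto the budget constraint.
        z = project_to_budget([xi + ui for xi, ui in zip(x, u)], budget)
        # dual step: accumulate the disagreement between x and z.
        u = [ui + xi - zi for ui, xi, zi in zip(u, x, z)]
    return z
```

The per-lever x-step is the part that grows with the number of cities or levers, and because each update touches only its own lever, it is a natural candidate for parallel execution.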

Key Statistics & Figures

Performance improvement in budget allocation optimization

40 times

This improvement was achieved by integrating Ray into Uber's existing systems.

Technologies & Tools

Backend

Ray

Used for parallelizing machine learning computations and optimization algorithms.

Backend

Spark

Used for data processing and handling large datasets.

Tools

Docker

Used for containerizing applications to streamline deployment.

Storage

Amazon S3

Used as an intermediate storage layer for application code.

Key Actionable Insights

1. Implementing Ray can drastically improve the performance of parallel computations in machine learning applications.
By allowing Python functions to run in parallel, Ray reduces the time required for processing large datasets, making it ideal for high-concurrency environments like Uber.

2. Using a hybrid approach with both Spark and Ray can optimize resource utilization and minimize bottlenecks.
This strategy allows Uber to leverage Spark's data processing capabilities while utilizing Ray for tasks that require high concurrency, thus improving overall system efficiency.

3. Optimizing deployment processes can significantly enhance development iteration speed.
By using object storage to manage application code changes, Uber reduced deployment times to just 2 minutes, allowing engineers to iterate more quickly on their projects.
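
The summary does not say how the object-storage step is wired up. One plausible shape, sketched below under stated assumptions, is to package only the application code into an archive, name it by a hash of its contents, and upload it only when that name is not already present. The `code/<hash>.zip` key layout, the function names, and the dict-like `store` standing in for an object-storage client are all hypothetical.

```python
import hashlib
import io
import zipfile
from pathlib import Path

FIXED_TIME = (1980, 1, 1, 0, 0, 0)  # constant timestamp keeps archives reproducible


def package_code(src_dir):
    """Zip the .py files under src_dir; return (object_key, archive_bytes)."""
    src = Path(src_dir)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path in sorted(src.rglob("*.py")):
            # Sorted order plus a fixed timestamp means identical code
            # always produces identical bytes, and so an identical key.
            info = zipfile.ZipInfo(path.relative_to(src).as_posix(), date_time=FIXED_TIME)
            zf.writestr(info, path.read_bytes())
    data = buf.getvalue()
    digest = hashlib.sha256(data).hexdigest()[:16]
    return f"code/{digest}.zip", data


def publish_code(store, src_dir):
    """Upload the archive unless an identical one is already stored.

    `store` is any dict-like object; a real deployment would swap in an
    object-storage client here (assumption, not shown).
    """
    key, data = package_code(src_dir)
    uploaded = key not in store
    if uploaded:
        store[key] = data
    return key, uploaded
```

With this layout, a job launcher would only need the returned key to fetch the matching code at start-up, and unchanged code costs no upload at all.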

Common Pitfalls

1. Relying solely on Spark for all computations can lead to performance bottlenecks.

Spark is not optimized for all types of Python code, particularly those requiring high concurrency. This can slow down processing times significantly.

2. Underestimating the complexity of migrating legacy code to a new framework.

Migrating from Spark to Ray requires significant effort to rewrite existing PySpark code, which can be a major hurdle for teams without adequate resources.

Related Concepts

Parallel Computing In Machine Learning

Optimization Algorithms In Data Science

Integration Of Distributed Systems