Enabling Offline Inferences at Uber Scale

Neeraj Dhake, Aravind Ranganathan
12 min read · advanced

Overview

The article discusses Uber's approach to automating offline inferences using machine learning and natural language processing on support interaction data. It details the challenges faced, the design of the solution, and the technologies utilized to streamline the process for engineers and data scientists.

What You'll Learn

1. How to automate the analysis of support interactions using ML and NLP
2. Why it's crucial to maintain model accuracy over time in production systems
3. How to leverage Uber's Michelangelo platform for custom model training
4. When to implement a no-code workflow orchestrator for data pipelines

Prerequisites & Requirements

  • Understanding of machine learning concepts and natural language processing
  • Familiarity with Uber's Michelangelo platform (optional)

Key Questions Answered

How does Uber automate the analysis of support interactions?
Uber automates the analysis of support interactions by applying machine learning and natural language processing algorithms to semi-structured and unstructured data. This allows for daily processing of support interactions to identify root causes of customer issues and categorize them effectively, improving the overall user experience.
What challenges did Uber face in their initial ML workflow?
Uber faced several challenges, including stale reports caused by the lack of a refresh cadence, data pipeline issues that affected report accuracy, and model accuracy that decreased over time. These issues highlighted the need for a more robust, automated system to maintain the quality and reliability of their ML inferences.
What technologies did Uber use for handling high-scale batch inferences?
Uber used Apache Spark for high-scale batch inferences, where a single execution can produce upwards of one million predictions. The choice was supported by the SparkMagic kernel in their workbench notebook, which let the work integrate with their existing infrastructure.
How does Uber ensure ML model accuracy over time?
Uber records the model version alongside each prediction in its output tables, so every inference can be traced back to the model that produced it. This makes it possible to evaluate model performance continuously and to improve the models iteratively based on real-world feedback.
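
As an illustration of recording the model version with each prediction, here is a minimal sketch in plain Python. It is not Uber's implementation: the `Prediction` row shape, the `classify_v1` keyword rule, and the sample interactions are all hypothetical, and a production job at this scale would run on Spark rather than in a single process. The point is only that once each output row carries a version, accuracy can be computed per version from a labeled sample.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    interaction_id: str
    predicted_category: str
    model_version: int  # recorded so each row can be traced to a model


def classify_v1(text: str) -> str:
    """Hypothetical stand-in for a trained classifier (simple keyword rule)."""
    return "payment" if "charge" in text.lower() else "other"


def run_batch(interactions, classify, model_version):
    """Score every interaction and tag each output row with the model version."""
    return [
        Prediction(interaction_id, classify(text), model_version)
        for interaction_id, text in interactions
    ]


def accuracy_by_version(predictions, labels):
    """Share of correct predictions per model version, over labeled rows only."""
    correct, total = defaultdict(int), defaultdict(int)
    for p in predictions:
        if p.interaction_id in labels:
            total[p.model_version] += 1
            if p.predicted_category == labels[p.interaction_id]:
                correct[p.model_version] += 1
    return {version: correct[version] / total[version] for version in total}


# Made-up sample data for illustration only.
interactions = [
    ("t1", "I was charged twice for one trip"),
    ("t2", "Driver never arrived"),
    ("t3", "Refund my fare"),
]
labels = {"t1": "payment", "t2": "other", "t3": "payment"}

output_table = run_batch(interactions, classify_v1, model_version=1)
print(accuracy_by_version(output_table, labels))
```

Because the version is stored per row rather than per run, rows written by different model versions can share one table and still be evaluated separately.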

Key Statistics & Figures

Number of predictions per execution
upwards of one million
This statistic highlights the scale at which Uber operates its batch inference jobs.
Initial ML classifier accuracy
60%
This low accuracy prompted the need for iterative improvements and backfilling of predictions with newer model versions.

Technologies & Tools


Backend
Apache Spark
Used for handling high-scale batch inferences.
Database
Apache Hive
Used for data warehousing and querying support interaction data.
ML Platform
Michelangelo
Provides model hosting and training capabilities for Uber's ML workflows.
Workflow Orchestrator
uWorc
Facilitates the management of data workflows with a no-code interface.

Key Actionable Insights

1. Implement a refresh cadence for your ML inferences so that reports stay up to date.
   Regularly refreshing your inferences keeps stale data out of decision-making and improves the accuracy of insights derived from your models.
2. Use a no-code workflow orchestrator like uWorc to simplify the management of data pipelines.
   This approach lets data analysts create and manage complex data workflows without extensive programming knowledge, which speeds up development.
3. Incorporate model versioning in your ML systems to facilitate backfilling and performance tracking.
   By recording which model version produced each inference, you can assess improvements over time and re-evaluate historical data with the latest models.
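
The backfilling idea in the third insight can be sketched as follows. This is a hedged illustration, not the article's code: the row dictionaries, the `classify_v2` keyword rule, and the `backfill` helper are hypothetical names chosen for the example. It assumes each stored row carries a `model_version` field, and re-scores only the rows written by an older model.

```python
def classify_v2(text: str) -> str:
    """Hypothetical newer classifier that also recognizes refund requests."""
    lowered = text.lower()
    return "payment" if "charge" in lowered or "refund" in lowered else "other"


def backfill(rows, inputs, classify, latest_version):
    """Re-score rows produced by an older model; leave current rows untouched."""
    refreshed = []
    for row in rows:
        if row["model_version"] < latest_version:
            # Build a new row instead of mutating the stored one.
            row = {
                "interaction_id": row["interaction_id"],
                "predicted_category": classify(inputs[row["interaction_id"]]),
                "model_version": latest_version,
            }
        refreshed.append(row)
    return refreshed


# Made-up stored predictions and their source text, for illustration only.
rows = [
    {"interaction_id": "t1", "predicted_category": "payment", "model_version": 1},
    {"interaction_id": "t3", "predicted_category": "other", "model_version": 1},
    {"interaction_id": "t4", "predicted_category": "other", "model_version": 2},
]
inputs = {
    "t1": "I was charged twice for one trip",
    "t3": "Refund my fare",
    "t4": "Lost item in car",
}

refreshed = backfill(rows, inputs, classify_v2, latest_version=2)
for row in refreshed:
    print(row)
```

Filtering on the stored version keeps the backfill idempotent: running it a second time with the same `latest_version` re-scores nothing.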

Common Pitfalls

1. Failing to maintain a refresh cadence for ML reports can lead to stale data.
   Without regular updates, insights derived from ML models may become outdated, leading to poor decision-making based on inaccurate information.
2. Overcomplicating data workflows can hinder productivity.
   Using overly complex data pipelines can create bottlenecks and make it difficult for data analysts to perform their tasks efficiently. Simplifying workflows with no-code tools can mitigate this issue.
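
One simple guard against the first pitfall is a staleness check on each report. The sketch below assumes a daily cadence, matching the daily processing described above; the `is_stale` helper and the timestamps are hypothetical and only show the comparison.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_refreshed, now, cadence=timedelta(days=1)):
    """Return True when more than one cadence has passed since the last refresh."""
    return now - last_refreshed > cadence


# Made-up timestamps for illustration only.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
fresh_report = datetime(2024, 1, 10, 2, 0, tzinfo=timezone.utc)
old_report = datetime(2024, 1, 7, 2, 0, tzinfo=timezone.utc)

print(is_stale(fresh_report, now))
print(is_stale(old_report, now))
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test.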

Related Concepts

Machine Learning
Natural Language Processing
Data Pipeline Management
Model Versioning