Enabling Deep Model Explainability with Integrated Gradients at Uber

Hugh Chen, Eric Wang, Gaoyuan Huang, Howard Yu, Jia Li, Sally Lee

Uber

Tags: Embedding, Keras, LIME, Machine Learning, PyTorch, SHAP, TensorFlow, XGBoost, YAML

Overview

This article discusses how Uber has integrated explainability into its machine learning platform, Michelangelo, using Integrated Gradients (IG) to provide interpretable attributions for deep learning models. It highlights the engineering challenges faced during implementation and the practical applications of IG in various use cases.

What You'll Learn

1. How to implement Integrated Gradients for model explainability in TensorFlow and PyTorch
2. Why explainability is crucial for trust in machine learning models
3. When to use Integrated Gradients versus other attribution methods such as SHAP and LIME

Prerequisites & Requirements

Understanding of machine learning concepts and deep learning frameworks
Familiarity with TensorFlow and PyTorch (optional)

Key Questions Answered

How does Uber implement explainability in its machine learning models?

Uber integrates explainability into its ML platform, Michelangelo, using Integrated Gradients (IG) to provide interpretable attributions for deep learning models. This implementation supports both TensorFlow and PyTorch, ensuring consistency across frameworks and enabling teams to understand model decisions effectively.
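To make the idea concrete, here is a minimal, framework-free sketch of Integrated Gradients. It is not Uber's implementation: the toy logistic model, its weights, and the function names (`model`, `model_grad`, `integrated_gradients`) are all assumptions chosen for illustration. IG averages the model's gradient along a straight path from a baseline input to the actual input and scales that average by the input difference. The final line checks the completeness property, namely that the attributions sum to the difference between the prediction at the input and at the baseline.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, b):
    """Toy differentiable model (assumed for illustration): logistic regression."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def model_grad(x, w, b):
    """Analytic gradient of the model output with respect to each input."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann approximation of IG along the straight path baseline -> x."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bl + alpha * (xi - bl) for xi, bl in zip(x, baseline)]
        grad = model_grad(point, w, b)
        avg_grad = [a + g / steps for a, g in zip(avg_grad, grad)]
    # Scale the averaged gradient by how far each input moved from its baseline.
    return [(xi - bl) * a for xi, bl, a in zip(x, baseline, avg_grad)]

# Hypothetical weights, input, and all-zero baseline.
w, b = [1.5, -2.0, 0.5], 0.1
x, baseline = [1.0, 0.5, 2.0], [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, w, b)

# Completeness: attributions should sum to f(x) - f(baseline).
completeness_gap = abs(sum(attributions) - (model(x, w, b) - model(baseline, w, b)))
```

In a real TensorFlow or PyTorch model the analytic gradient would be replaced by the framework's automatic differentiation; the path interpolation and the final scaling step stay the same.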

What are the challenges faced when implementing Integrated Gradients?

Challenges include saving and loading raw models so that they remain differentiable, wrapping models so that attributions are computed consistently, and supporting calibration for decision thresholds. Handling categorical features and enabling multi-layer explanations were further significant hurdles.
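The categorical-feature challenge can be illustrated with a small sketch. Category IDs are not differentiable, so one common approach is to compute IG at the embedding layer and then sum the per-dimension attributions back into a single score per feature. Whether Uber does exactly this is not stated in this summary. The embedding table, weights, feature names, and all-zero baseline below are hypothetical.

```python
import math

# Hypothetical embedding table for one categorical feature (a city ID).
EMBEDDINGS = {"sf": [0.8, -0.3], "nyc": [-0.5, 0.9]}
W_EMB = [1.2, -0.7]  # assumed weights on the two embedding dimensions
W_NUM = 0.4          # assumed weight on a single numeric feature
BIAS = -0.2

def forward(vec):
    """Toy model over the continuous vector [emb_0, emb_1, numeric]."""
    z = W_EMB[0] * vec[0] + W_EMB[1] * vec[1] + W_NUM * vec[2] + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def grad(vec):
    """Analytic gradient of the toy model with respect to the continuous vector."""
    p = forward(vec)
    return [p * (1.0 - p) * w for w in (W_EMB[0], W_EMB[1], W_NUM)]

def ig(vec, baseline, steps=200):
    """Midpoint Riemann approximation of Integrated Gradients."""
    total = [0.0] * len(vec)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (v - b) for v, b in zip(vec, baseline)]
        total = [t + g / steps for t, g in zip(total, grad(point))]
    return [(v - b) * t for v, b, t in zip(vec, baseline, total)]

def explain(category, numeric):
    """Attribute at the embedding layer, then collapse to one score per feature."""
    vec = EMBEDDINGS[category] + [numeric]
    baseline = [0.0, 0.0, 0.0]  # assumed all-zero embedding and numeric baseline
    attr = ig(vec, baseline)
    return {"city": attr[0] + attr[1], "numeric": attr[2]}

scores = explain("sf", 3.0)
```

Summing over embedding dimensions keeps the completeness property intact, so the per-feature scores still add up to the change in prediction relative to the baseline.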

What are the use cases for Integrated Gradients at Uber?

Integrated Gradients is used for regulatory accountability, model debugging, feature validation, and operational monitoring. These applications help teams understand model behavior, validate feature importance, and trace anomalies in production systems, which strengthens trust in and the reliability of ML outputs.

Key Statistics & Figures

Reduction in attribution time: over 80%

Achieved by parallelizing IG computation with Ray, which improves efficiency in large-scale pipelines.

Technologies & Tools

TensorFlow (backend)

Used for building and deploying machine learning models with Integrated Gradients.

PyTorch (backend)

The other framework supported for Integrated Gradients in Uber's ML platform.

Ray (backend)

Used for parallelizing the computation of Integrated Gradients to enhance performance.

Key Actionable Insights

1. Integrate explainability into your ML workflows using Integrated Gradients to enhance model trust.
By adopting IG, teams can provide clear attributions for model predictions, which is essential for regulatory compliance and stakeholder trust.

2. Use Jupyter notebooks for interactive exploration of model predictions and attributions.
This allows for on-demand analysis and debugging, making it easier to investigate specific model behaviors and improve decision-making.

3. Implement parallel processing with Ray to optimize the performance of Integrated Gradients.
This approach significantly reduces computation time, making it feasible to use IG in large-scale models and batch evaluations.
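The parallelization pattern in the last insight can be sketched as follows. IG attributions for different rows are independent, so a batch can be split into chunks that are processed concurrently and then merged in order. To stay dependency-free, this sketch uses the standard library's `ThreadPoolExecutor` as a stand-in for Ray; it shows the structure only and is not Uber's pipeline. With Ray, each chunk would typically become a remote task instead. The toy model, weights, and chunk size are assumptions.

```python
import math
from concurrent.futures import ThreadPoolExecutor

W, B = [1.5, -2.0, 0.5], 0.1  # hypothetical toy logistic model

def gradient(x):
    """Analytic gradient of sigmoid(W.x + B) with respect to x."""
    p = 1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(W, x)) + B)))
    return [p * (1.0 - p) * w for w in W]

def ig_one(x, baseline, steps=100):
    """Integrated Gradients for a single row (midpoint Riemann sum)."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        total = [t + g / steps for t, g in zip(total, gradient(point))]
    return [(xi - b) * t for xi, b, t in zip(x, baseline, total)]

def ig_chunk(rows):
    """Unit of work handed to one worker: attributions for a chunk of rows."""
    baseline = [0.0] * len(W)  # assumed all-zero baseline
    return [ig_one(row, baseline) for row in rows]

def parallel_ig(rows, chunk_size=2, workers=4):
    """Split the batch into chunks, attribute them concurrently, merge in order."""
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(ig_chunk, chunks))  # map preserves chunk order
    return [attr for chunk in results for attr in chunk]

batch = [[1.0, 0.5, 2.0], [0.2, 0.1, 0.0], [3.0, 1.0, 1.0],
         [0.0, 2.0, 1.0], [1.0, 1.0, 1.0]]
parallel = parallel_ig(batch)
serial = ig_chunk(batch)
```

Threads will not speed up this pure-Python arithmetic; the point is the chunk, dispatch, and merge shape, which carries over to a process-based or cluster-based executor.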

Common Pitfalls

1. Selecting the wrong baseline inputs and layers for attribution can lead to misleading explanations.
Choosing them well requires a deep understanding of the model architecture, so users should familiarize themselves with IG-specific concepts and best practices before relying on the results.
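A small sketch shows how strongly the baseline shapes the result. The toy model, weights, and both baselines are hypothetical. Under an all-zero baseline the first feature receives a positive attribution. Under a baseline that happens to match the input on that feature, its attribution is exactly zero, even though the model's weight on it has not changed.

```python
import math

W, B = [1.5, -2.0, 0.5], 0.1  # hypothetical toy logistic model

def predict(x):
    return 1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(W, x)) + B)))

def gradient(x):
    """Analytic gradient of the prediction with respect to each input."""
    p = predict(x)
    return [p * (1.0 - p) * w for w in W]

def integrated_gradients(x, baseline, steps=200):
    """Midpoint Riemann approximation of IG along the path baseline -> x."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        total = [t + g / steps for t, g in zip(total, gradient(point))]
    return [(xi - b) * t for xi, b, t in zip(x, baseline, total)]

x = [1.0, 0.5, 2.0]
zero_baseline = [0.0, 0.0, 0.0]
shifted_baseline = [1.0, 0.0, 0.0]  # hypothetical: equals x on feature 0

attr_zero = integrated_gradients(x, zero_baseline)
attr_shifted = integrated_gradients(x, shifted_baseline)
```

Both results satisfy completeness with respect to their own baseline, so neither is wrong in a numerical sense. They answer different questions, which is why the baseline should represent a meaningful "absence of signal" for the model at hand.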

Related Concepts

Machine Learning Interpretability

Feature Attribution Methods

Deep Learning Frameworks