Data Science &#x2d; Top Resources from GTC 21

Chase Hooley

Accelerated data science can dramatically boost the performance of end-to-end analytics workflows, speeding up value generation while reducing cost.

NVIDIA

•

Chase Hooley

•3 min read•intermediate•

--

•View Original

DaskMachine Learning

Overview

The article discusses how accelerated data science can enhance data analytics workflows by leveraging NVIDIA technologies, significantly improving performance and reducing costs. It highlights various sessions from GTC 21 that showcase real-world applications of GPU computing in companies like Spotify and Walmart.

What You'll Learn

1

How to utilize cuDF and Dask-CUDF for interactive model evaluation

2

Why GPU acceleration is crucial for enhancing recommender systems

3

How to deploy GPU-accelerated applications across hybrid and multi-clouds

Key Questions Answered

How does Spotify reduce model evaluation time using NVIDIA technologies?

Spotify implemented cuDF and Dask-CUDF to transform their offline evaluation process, reducing the time from hours to minutes. This acceleration allows for increased iteration speed and improved model development.

What benefits does Walmart gain from using GPU computing?

Walmart has developed applications that leverage GPU computing to handle computationally intensive processes. The article discusses performance comparisons between CPU and GPU, highlighting capabilities that are only feasible with GPU architectures.

What is the Merlin framework and its components?

The Merlin framework consists of NVTabular for ETL, HugeCTR for training, and Triton for inference serving. It accelerates recommender systems on GPU, improving performance by approximately 10x compared to traditional methods.

How does Cloudera Data Platform deploy GPU-accelerated applications?

Cloudera Data Platform uses a single pane of glass to manage and deploy GPU-accelerated applications across hybrid and multi-cloud environments, simplifying the deployment process for users.

Key Statistics & Figures

Reduction in model evaluation time

From hours to minutes

This statistic highlights the efficiency gained by Spotify through the use of NVIDIA technologies.

Performance improvement factor for recommender systems

~10x

This improvement is achieved by using the Merlin framework compared to commonly used methods.

Technologies & Tools

Library

Cudf

Used by Spotify for interactive model evaluation.

Library

Dask-cudf

Facilitates the reduction of model evaluation time at Spotify.

Framework

Merlin

Accelerates recommender systems on GPU.

Library

Nvtabular

Part of the Merlin framework for ETL tasks.

Library

Hugectr

Used for training in the Merlin framework.

Library

Triton

Used for inference serving in the Merlin framework.

Key Actionable Insights

1
Leverage NVIDIA's cuDF and Dask-CUDF to enhance your model evaluation processes.
By adopting these tools, you can significantly reduce the time taken for model evaluations, allowing for quicker iterations and improved model performance.

2
Consider implementing the Merlin framework for your recommender systems to achieve better performance.
Using NVTabular, HugeCTR, and Triton can speed up ETL tasks and model training, making your recommendation processes more efficient.

3
Utilize GPU computing for computationally intensive applications to unlock new capabilities.
As demonstrated by Walmart, GPU computing can enable processes that are not feasible with CPU-only architectures, leading to significant performance gains.