GTC 21: Top 5 NVIDIA AI / DL Technical Sessions

Brad Nemire

With more than 1,400 sessions including the latest deep learning technologies in conversational AI, recommender systems, computer vision, and video streaming…

NVIDIA


Deep Learning, Kubernetes, Transfer Learning

Overview

This article highlights the top five AI and deep learning sessions at NVIDIA GTC 21, covering conversational AI, recommender systems, and video streaming. It emphasizes practical tools for building AI workflows and deploying models.

What You'll Learn

1. How to customize ASR and NLP pipelines for a conversational AI application
2. How to accelerate recommender systems using the Merlin framework
3. How to utilize NGC artifacts for building conversational AI solutions
4. How to deploy AI models at scale using Triton Inference Server

Key Questions Answered

What are the benefits of using the Merlin framework for recommender systems?

The Merlin framework accelerates recommender systems on GPU, speeding up ETL tasks, model training, and inference serving by approximately 10x compared to traditional methods. It is designed for ease of use and for integration with existing pipelines.

How does NVIDIA Maxine enhance video conferencing applications?

NVIDIA Maxine uses AI video compression to reduce video bandwidth to one-tenth of what H.264 requires. It also includes features such as face alignment, gaze correction, and real-time translation, which improve the overall video conferencing experience.

What is Triton Inference Server and how does it simplify model deployment?

Triton Inference Server is open-source model-serving software that simplifies the deployment of AI models at scale. It supports concurrent model execution and dynamic batching, making it easier to serve models from various frameworks on diverse infrastructures.
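The dynamic batching mentioned above can be illustrated with a minimal sketch. This is not Triton's scheduler or API; it is a plain-Python illustration of the general idea, assuming a simple greedy policy in which queued requests are grouped in arrival order until a hypothetical `max_batch_size` would be exceeded. The function name `form_batches` and its parameters are invented for this example.

```python
from typing import List, Sequence


def form_batches(request_sizes: Sequence[int], max_batch_size: int) -> List[List[int]]:
    """Greedily group queued requests into batches of at most max_batch_size samples.

    Each entry of request_sizes is the number of samples in one queued request.
    Returns a list of batches, each holding the indices of the requests it contains.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    batches: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(request_sizes):
        if size > max_batch_size:
            raise ValueError(f"request {index} exceeds max_batch_size")
        # Close the current batch when adding this request would overflow it.
        if current and current_size + size > max_batch_size:
            batches.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Six queued requests of varying size, batched with a limit of 4 samples.
    print(form_batches([1, 1, 2, 4, 3, 1], max_batch_size=4))
```

With the sample queue above, the first three requests share one batch, the fourth fills a batch on its own, and the last two are combined. A real serving system would also bound how long a request may wait in the queue, which this sketch leaves out.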

Key Statistics & Figures

Performance improvement in recommender systems: 10x
Achieved by using the Merlin framework for ETL, training, and inference serving.

Video bandwidth reduction: 1/10th of H.264
Accomplished through AI video compression in NVIDIA Maxine.
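To make the one-tenth figure concrete, the arithmetic can be sketched as follows. The 1500 kbps H.264 bitrate is a hypothetical value chosen for illustration, not a number from the sessions, and the helper names are invented for this example.

```python
def reduced_bitrate_kbps(h264_bitrate_kbps: float, reduction_factor: float = 10.0) -> float:
    """Bitrate after dividing an H.264 bitrate by the stated reduction factor."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return h264_bitrate_kbps / reduction_factor


def megabytes_per_hour(bitrate_kbps: float) -> float:
    """Data transferred in one hour at a constant bitrate (1 MB = 1000 kB)."""
    kilobits = bitrate_kbps * 3600   # seconds in an hour
    kilobytes = kilobits / 8         # 8 bits per byte
    return kilobytes / 1000


if __name__ == "__main__":
    h264 = 1500.0  # hypothetical H.264 stream bitrate in kbps
    ai = reduced_bitrate_kbps(h264)
    print(f"H.264: {h264} kbps, {megabytes_per_hour(h264)} MB/hour")
    print(f"AI compression: {ai} kbps, {megabytes_per_hour(ai)} MB/hour")
```

Under that assumed bitrate, an hour-long call drops from 675 MB to 67.5 MB per stream.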

Technologies & Tools

NVIDIA Transfer Learning Toolkit (Software)
Used for customizing deep learning models in conversational AI applications.

Merlin (Framework)
Framework for accelerating recommender systems on GPU.

Triton Inference Server (Software)
Model-serving software that simplifies the deployment of AI models.

NVIDIA Maxine (SDK)
Platform SDK for developers of video conferencing services.

Key Actionable Insights

1. Leverage the Merlin framework to speed up your recommender systems.
By integrating NVTabular for ETL and HugeCTR for training, you can achieve a roughly tenfold speedup, which matters when handling large datasets in production environments.

2. Use Triton Inference Server to deploy AI models at scale and streamline your production workflows.
It provides high-performance inference serving and integrates with Kubernetes, which suits teams looking to optimize their deployment processes.

3. Explore NVIDIA Maxine to reduce bandwidth usage in video applications while improving the user experience.
AI-driven features such as noise removal and virtual assistants can significantly improve the quality of video conferencing, which is increasingly important in remote work settings.
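For the Triton insight above, it may help to see what an inference request looks like on the wire. Triton's HTTP endpoint follows the KServe v2 inference protocol, in which a client posts a JSON body to `/v2/models/<model>/infer`. The sketch below only builds the URL and payload and sends nothing; the model name `my_model`, the input name `INPUT0`, and the address `http://localhost:8000` are assumptions for illustration and must match your own deployment and model configuration.

```python
import json
from typing import Any, Dict, List


def infer_url(base_url: str, model_name: str) -> str:
    """Build the v2 inference endpoint URL for a model."""
    return f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"


def build_infer_request(input_name: str, rows: List[List[float]]) -> Dict[str, Any]:
    """Build a v2 inference request body for one 2-D FP32 input tensor."""
    if not rows or any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("rows must be a non-empty rectangular list")
    # Tensor data is sent flattened in row-major order alongside its shape.
    flat = [float(value) for row in rows for value in row]
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [len(rows), len(rows[0])],
                "datatype": "FP32",
                "data": flat,
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical server address, model name, and input name.
    print(infer_url("http://localhost:8000", "my_model"))
    print(json.dumps(build_infer_request("INPUT0", [[1, 2], [3, 4]]), indent=2))
```

In practice the official client libraries wrap this protocol, so hand-built payloads are mainly useful for debugging or for quick checks against a running server.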

Common Pitfalls

1. Failing to properly fine-tune AI models for specific domains can lead to suboptimal performance.
Without adequate customization, models may not meet enterprise requirements, resulting in wasted resources and time during deployment.