NVIDIA TensorRT™ is a high-performance deep learning inference optimizer and runtime that delivers low latency, high-throughput inference for deep learning…
Overview
NVIDIA TensorRT™ is a deep learning inference optimizer and runtime that speeds up inference for trained models, including models built in TensorFlow. TensorRT 3 introduces a TensorFlow Model Importer, a Python API, and Volta Tensor Core Support, which together improve inference speed on Tesla V100 GPUs.
What You'll Learn
How to import and optimize TensorFlow models using TensorRT
Why using a Python API can improve productivity in deep learning inference
When to leverage Volta Tensor Core Support for faster inference
Key Questions Answered
What are the key features introduced in TensorRT 3?
How does TensorRT improve TensorFlow inference performance?
What is the benefit of using the Python API in TensorRT?
What performance improvements can be expected with Volta Tensor Core Support?
Key Actionable Insights
1. Utilize the TensorFlow Model Importer to streamline your workflow. This feature allows you to easily import and optimize your existing TensorFlow models, saving time and reducing complexity in the deployment process.
2. Leverage the Python API for rapid development and testing. The Python API simplifies interactions with TensorRT, making it easier to prototype and iterate on deep learning models without getting bogged down in lower-level details.
3. Consider upgrading to Tesla V100 GPUs to maximize performance gains. If your application demands high throughput and low latency, the performance improvements offered by Volta Tensor Core Support can significantly enhance your inference capabilities.
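To make the Volta Tensor Core point more concrete: Tensor Cores multiply FP16 inputs and can accumulate the products in FP32. The sketch below emulates that arithmetic on the CPU using only the Python standard library, to illustrate why FP32 accumulation keeps a long dot product accurate while FP16-only accumulation drifts. It is not TensorRT code and does not measure GPU speed; the helper names (to_fp16, to_fp32, dot_fp16_only, dot_mixed_precision) and the test vectors are assumptions chosen for illustration.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_fp16_only(a, b):
    """Dot product with FP16 inputs, FP16 products, and an FP16 accumulator."""
    acc = 0.0
    for x, y in zip(a, b):
        product = to_fp16(to_fp16(x) * to_fp16(y))
        acc = to_fp16(acc + product)
    return acc


def dot_mixed_precision(a, b):
    """Dot product with FP16 inputs and an FP32 accumulator (emulated)."""
    acc = 0.0
    for x, y in zip(a, b):
        # The product of two FP16 values is exactly representable in a double.
        product = to_fp16(x) * to_fp16(y)
        acc = to_fp32(acc + product)
    return acc


# Hypothetical test vectors: 4096 copies of 0.1 on each side.
n = 4096
a = [0.1] * n
b = [0.1] * n

# Reference result computed in double precision from the FP16-rounded inputs.
reference = n * to_fp16(0.1) ** 2

fp16_result = dot_fp16_only(a, b)
mixed_result = dot_mixed_precision(a, b)

print("reference      :", reference)
print("FP16 accumulate:", fp16_result, "error", abs(fp16_result - reference))
print("FP32 accumulate:", mixed_result, "error", abs(mixed_result - reference))
```

In this emulation the FP16 accumulator stops growing once each new product is smaller than half the spacing between adjacent FP16 values, while the FP32 accumulator stays close to the reference. Real hardware behavior depends on the kernel and layer, so treat this as an illustration of the numeric idea only.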