
How NVIDIA Uses RoBERTa

7 engineering articles about RoBERTa from NVIDIA's engineering team

Articles

NVIDIA
Advanced
The article discusses how to efficiently scale large language model (LLM) training across a large GPU cluster using the open-source frameworks Alpa and Ray.
NVIDIA
Intermediate
This article discusses the deployment of NVIDIA TensorRT for AI inference on NVIDIA hardware, focusing on optimizing performance and compatibility.
Maximilian Müller
10 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
This article discusses how to leverage RAPIDS, HuggingFace, and Dask to run state-of-the-art NLP workloads at scale on GPUs.
Vibhu Jawa
7 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
This article discusses the advancements in language modeling using Megatron on the NVIDIA A100 GPU, highlighting the significant improvements in natural language processing tasks achieved through m...
Mohammad Shoeybi
9 min read
Has Summary
--
NVIDIA
Advanced
The Allen Institute for Artificial Intelligence has achieved a significant milestone with its BERT-based model, Aristo, which successfully passed a 12th-grade science exam with an accuracy of 83%.
NVIDIA
Advanced
NVIDIA has achieved a groundbreaking milestone by training BERT-Large in just 47 minutes using the DGX SuperPOD, and has also developed the largest Transformer-based model, GPT-2 8B, with 8.
Shar Narasimhan
8 min read
Has Summary
--
NVIDIA
Advanced
The article discusses the optimizations NVIDIA has made to the BERT model using TensorRT, enabling real-time natural language understanding with significantly reduced latency.
Purnendu Mukherjee
19 min read
Includes Code
Has Summary
--
