How NVIDIA Uses Hugging Face
188 engineering articles about Hugging Face from NVIDIA's engineering team
Articles
The article discusses the use of NVFP4 low-precision model training to achieve higher throughput without sacrificing accuracy in AI model training.
Aditya Vavre
7 min read
Includes Code
Has Summary
--
The article discusses how NVIDIA's hardware-software co-design significantly enhanced the inference performance of Sarvam AI's Sovereign 30B model, achieving a 4x speedup on NVIDIA Blackwell archit...
Utkarsh Uppal
14 min read
Has Summary
--
The article discusses NVIDIA TensorRT LLM AutoDeploy, a beta feature that automates the inference optimization process for large language models (LLMs).
Lucas Liebenwein
8 min read
Includes Code
Has Summary
--
Kimi K2.5 is an advanced multimodal vision language model (VLM) developed by Kimi, optimized for various AI tasks.
Anu Srivastava
4 min read
Includes Code
Has Summary
--
The article provides a comprehensive guide on building a document processing pipeline using NVIDIA Nemotron RAG, focusing on the extraction of structured data from complex documents like PDFs.
Chia-Chih Chen
9 min read
Includes Code
Has Summary
--
The article discusses how to utilize NVIDIA Earth-2 to downscale coarse climate projections into high-resolution, bias-corrected fields, enabling better assessment of local climate extremes.
Georg Ertl
11 min read
Includes Code
Has Summary
--
This article explores how to train an AI agent to operate a new Command Line Interface (CLI) using synthetic data generation and reinforcement learning.
Chris Alexiuk
11 min read
Includes Code
Has Summary
--
The article discusses the development of generalist humanoid capabilities using NVIDIA Isaac GR00T N1.6 through a sim-to-real workflow.
Edith Llontop
7 min read
Has Summary
--
The article discusses the introduction of NVIDIA TensorRT Edge-LLM, an open-source C++ framework designed for high-performance inference of Large Language Models (LLMs) and Vision Language Models (...
Lin Chai
5 min read
Includes Code
Has Summary
--
The article discusses the latest software and model optimizations for NVIDIA DGX Spark, highlighting significant performance improvements in AI workflows.
Allen Bourgoyne
5 min read
Has Summary
--
The article discusses the NVIDIA Rubin platform, which introduces six new chips designed to create a powerful AI supercomputer.
Kyle Aubrey
59 min read
Has Summary
--
The article introduces NVIDIA Isaac Lab-Arena, an open-source framework designed for efficient and scalable evaluation of generalist robot policies in simulation.
Sangeeta Subramanian
9 min read
Includes Code
Has Summary
--
This article provides a comprehensive tutorial on building a voice agent using NVIDIA's Nemotron models, focusing on retrieval-augmented generation (RAG) and safety guardrails.
Chris Alexiuk
8 min read
Includes Code
Has Summary
--
The article discusses NVIDIA's Alpamayo, a comprehensive ecosystem designed for developing reasoning-based autonomous vehicle (AV) systems.
Marco Pavone
11 min read
Includes Code
Has Summary
--
The article discusses the integration of AI Physics into Technology Computer-Aided Design (TCAD) simulations, highlighting its significance in semiconductor manufacturing.
Ram Cherukuri
7 min read
Has Summary
--
The article discusses NVIDIA Nemotron 3, a family of open models designed for agentic AI systems, emphasizing its efficiency and accuracy through innovative architectures and techniques.
Chris Alexiuk
9 min read
Has Summary
--
The article discusses the creation of privacy-preserving evaluation benchmarks using synthetic data, particularly in regulated domains like healthcare.
Isabel Hulseman
11 min read
Includes Code
Has Summary
--
The article discusses the implementation of Edge AI on the NVIDIA Jetson platform, focusing on the use of Large Language Models (LLMs), Vision Language Models (VLMs), and Foundation Models in robot...
Chitoku Yato
9 min read
Includes Code
Has Summary
--
The article discusses enhancing the quality of 3D Gaussian reconstruction for simulation, focusing on the use of NVIDIA's Fixer model to eliminate rendering artifacts.
Wonsik Han
7 min read
Includes Code
Has Summary
--
The NVIDIA-accelerated Mistral 3 open model family offers developers and enterprises industry-leading accuracy, efficiency, and customization capabilities.
Anu Srivastava
6 min read
Has Summary
--
The article introduces Broadened Reinforcement Learning (BroRL), a new paradigm that enhances the training of large language models (LLMs) by focusing on rollout scaling rather than just increasing...
Jian Hu
6 min read
Has Summary
--
This article discusses how to achieve 4x faster inference for math problem solving using large language models by optimizing the serving stack, quantization strategy, and decoding methods.
Igor Gitman
7 min read
Includes Code
Has Summary
--
The article discusses NVIDIA Grove, a Kubernetes API designed to streamline complex AI inference workloads by managing multicomponent systems.
Sanjay Chatterjee
9 min read
Includes Code
Has Summary
--
The article discusses the benchmarking of AI coding assistants in writing efficient CUDA code using the ComputeEval framework.
Daniel Rodriguez
2 min read
Has Summary
--
The article discusses how NVIDIA's NeMo Automodel simplifies the training of large-scale mixture-of-experts (MoE) models in PyTorch, making it accessible to a broader audience.
Hemil Desai
7 min read
Includes Code
Has Summary
--
The article discusses how to scale biology transformer models using PyTorch and NVIDIA BioNeMo Recipes, focusing on advanced parallel computing techniques and the integration of the NVIDIA Transfor...
Kyle Tretina
6 min read
Includes Code
Has Summary
--
The article discusses the advancements in Explainable AI for radiology through NVIDIA Clara Reason, focusing on the NV-Reason-CXR-3B model that enhances diagnostic transparency and mimics radiologi...
Andriy Myronenko
11 min read
Includes Code
Has Summary
--
The article discusses how NVIDIA Run:ai enhances AI infrastructure management on Microsoft Azure by optimizing GPU utilization and simplifying workload orchestration.
Julie Adrounie
8 min read
Has Summary
--
The article introduces CodonFM, a new state-of-the-art RNA foundation model developed by NVIDIA as part of the Clara open model family.
Kyle Gion
10 min read
Includes Code
Has Summary
--
The article discusses the launch of NVIDIA's new Nemotron models designed for developing specialized AI agents that integrate language and vision capabilities.
Chris Alexiuk
8 min read
Has Summary
--
The article discusses how the NVIDIA DGX Spark supercomputer enhances performance for intensive AI tasks, providing a local alternative to cloud computing.
Allen Bourgoyne
5 min read
Has Summary
--
This article outlines a streamlined process for reconstructing 3D environments for robotics simulation using only a smartphone, specifically an iPhone.
Wonsik Han
10 min read
Includes Code
Has Summary
--
The article discusses how to fine-tune and scale large language models (LLMs) using the open-source Unsloth framework on NVIDIA Blackwell GPUs.
Paul Abruzzo
5 min read
Includes Code
Has Summary
--
This article guides readers through the process of creating a Bash computer use agent using the NVIDIA Nemotron Nano v2 model.
Mehran Maghoumi
14 min read
Includes Code
Has Summary
--
The article discusses building an AI agent using NVIDIA Nemotron to analyze IT tickets, focusing on extracting insights from unstructured data through advanced AI reasoning and graph databases.
Bhaskar Bhowmik
10 min read
Includes Code
Has Summary
--
The article discusses the development of an AI-powered log analysis solution using NVIDIA's Generative AI reference workflows.
Prashant Bhende
5 min read
Includes Code
Has Summary
--
The article discusses the optimization of large language models (LLMs) through pruning and knowledge distillation using NVIDIA TensorRT Model Optimizer.
Max Xu
10 min read
Includes Code
Has Summary
--
The article discusses the integration of the NVIDIA KAI Scheduler with Ray, enabling advanced scheduling features like gang scheduling, workload prioritization, and autoscaling in Ray clusters.
Ekin Karabulut
9 min read
Includes Code
Has Summary
--
The article discusses the integration of NVIDIA Run:ai v2.23 with NVIDIA Dynamo to address the challenges of large language model (LLM) inference across distributed environments.
Ekin Karabulut
9 min read
Includes Code
Has Summary
--
The article discusses how OpenUSD can enhance robotics development through improved data ingestion, aggregation, and the use of SimReady assets.
Matias Codesal
6 min read
Includes Code
Has Summary
--
The article discusses the integration of computer vision pipelines with Generative AI and reasoning, highlighting the advancements in video analytics through NVIDIA's Blueprint for Video Search and...
Samuel Ochoa
11 min read
Has Summary
--
The article provides a comprehensive guide on building a Retrieval-Augmented Generation (RAG) agent using NVIDIA Nemotron, emphasizing the integration of external information to enhance text genera...
Edward Li
16 min read
Includes Code
Has Summary
--
The article introduces speculative decoding as a technique to reduce latency in AI inference, particularly for large language models (LLMs).
Jamie Li
10 min read
Includes Code
Has Summary
--
The article discusses the challenges of cold start latency in deploying large language models (LLMs) and introduces the NVIDIA Run:ai Model Streamer, an open-source Python SDK designed to optimize ...
Omer Dayan
12 min read
Has Summary
--
This article provides a comprehensive guide on building a report generation AI agent using NVIDIA Nemotron on OpenRouter.
Edward Li
13 min read
Includes Code
Has Summary
--
The article discusses the release of two new open-source models, Qwen3-Next 80B-A3B-Thinking and Qwen3-Next 80B-A3B-Instruct, which utilize a hybrid Mixture of Experts (MoE) architecture to enhance...
Anu Srivastava
4 min read
Includes Code
Has Summary
--
The article discusses how Quantization Aware Training (QAT) and Quantization Aware Distillation (QAD) can enhance low-precision model accuracy recovery beyond traditional Post-Training Quantization...
Eduardo Alvarez
9 min read
Includes Code
Has Summary
--
The article discusses the deployment of scalable AI inference using NVIDIA NIM Operator 3.0.0, highlighting its capabilities in managing AI inference pipelines across Kubernetes environments.
Meenakshi Kaushik
6 min read
Includes Code
Has Summary
--
The article discusses how to build in-house AI systems using Outerbounds and NVIDIA DGX Cloud Lepton, emphasizing the importance of orchestrating multiple models and dynamic data.
Ville Tuulos
10 min read
Includes Code
Has Summary
--
The article discusses how to enhance the efficiency of Large Language Models (LLMs) during inference by utilizing CPU-GPU memory sharing through NVIDIA's NVLink C2C technology.
Afroze Syed
6 min read
Includes Code
Has Summary
--