# Reinforcement Learning Programming Tutorials & Engineering Articles
94 Reinforcement Learning tutorials, guides, and engineering insights from NVIDIA, OpenAI, Uber, and more
Reinforcement Learning Articles & Tutorials
This article explores how to train an AI agent to operate a new Command Line Interface (CLI) using synthetic data generation and reinforcement learning.
Chris Alexiuk
11 min read
Includes Code
Has Summary
--
The article discusses NVIDIA Nemotron 3, a family of open models designed for agentic AI systems, emphasizing the efficiency and accuracy the models achieve through innovative architectures and techniques.
Chris Alexiuk
9 min read
Has Summary
--
The article discusses the development of scientific AI agents using reinforcement learning (RL) techniques, specifically through the NVIDIA NeMo framework.
Christian Munley
12 min read
Includes Code
Has Summary
--
The article introduces Broadened Reinforcement Learning (BroRL), a new paradigm that enhances the training of large language models (LLMs) by focusing on rollout scaling rather than just increasing...
Jian Hu
6 min read
Has Summary
--
Netflix introduces Advantage-Weighted Supervised Fine-Tuning (A-SFT), a novel post-training algorithm for generative recommender systems that addresses the unique challenges of applying reinforceme...
Netflix Technology Blog
12 min read
Has Summary
--
The article introduces Tunix, a new open-source, JAX-native library designed for post-training of large language models (LLMs).
Srikanth Kilaru, Tianshu Bao
7 min read
Includes Code
Has Summary
--
Train a Quadruped Locomotion Policy and Simulate Cloth Manipulation with NVIDIA Isaac Lab and Newton
This article discusses the integration of the Newton physics engine with NVIDIA Isaac Lab for training quadruped locomotion policies and simulating cloth manipulation.
--
The article discusses the enhancements in reinforcement learning training throughput using NVIDIA NeMo-RL with Megatron-Core support.
Anna Shors
7 min read
Includes Code
Has Summary
--
The article discusses the advancements in reinforcement learning for large language models (LLMs) through the introduction of ProRL v2 by NVIDIA Research.
Jian Hu
7 min read
Includes Code
Has Summary
--
The article discusses the release of the NVIDIA Llama Nemotron Super 49B v1.5, highlighting its advancements in accuracy, efficiency, and reasoning capabilities for AI agents.
Chris Alexiuk
5 min read
Has Summary
--
The article introduces NVIDIA NeMo-RL, an open-source library for reinforcement learning that supports scalable training from single-GPU to thousand-GPU models.
Alexander Bukharin
5 min read
Includes Code
Has Summary
--
This article discusses how Uber utilizes reinforcement learning techniques to enhance the efficiency of its marketplace by improving the balance between drivers and demand.
Prateek Jain, Soheil Sadeghi, Mehrdad Bakhtiari
11 min read
Has Summary
--
The article discusses the advancements in AI autonomy through NVIDIA's Nemotron open reasoning models, which enhance AI agents' decision-making capabilities in complex environments.
Nirmal Kumar Juluru
6 min read
Has Summary
--
The article discusses the development and capabilities of NVIDIA's Llama Nemotron reasoning models, which enhance AI agents' reasoning abilities for complex problem-solving in various industries.
Chris Alexiuk
11 min read
Has Summary
--
The article discusses how NVIDIA Cosmos World Foundation Models (WFMs) enhance the development of AI-driven robots and autonomous vehicles by providing high-fidelity, physics-aware synthetic data.
Pranjali Joshi
7 min read
Includes Code
Has Summary
--
Gemma 3 is the latest version of the Gemma open-model family, boasting enhanced capabilities such as multimodality, longer context windows, and improved reasoning.
Omar Sanseviero, Philipp Schmid
5 min read
Includes Code
Has Summary
--
The article discusses a new alignment strategy called deliberative alignment, which teaches reasoning to language models to enhance their safety.
Melody Guan
8 min read
Has Summary
--
The article discusses the development of domain-adapted foundation GenAI models at LinkedIn, focusing on their application within the Economic Opportunity Network (EON) project.
Praveen Kumar Bodigutla
12 min read
Has Summary
--
The article discusses advancements in red teaming methodologies at OpenAI, focusing on the integration of human and AI efforts to identify potential risks in AI systems.
OpenAI
8 min read
Has Summary
--
The article discusses Airbnb's evolution in location retrieval from simple heuristics to advanced machine learning and reinforcement learning techniques.
Dillon Davis
9 min read
Has Summary
--
The article discusses Netflix's approach to enhancing long-term member satisfaction through its recommendation algorithms.
Netflix Technology Blog
9 min read
Has Summary
--
The article discusses the development of Pinterest Canvas, a text-to-image foundation model aimed at enhancing existing images and products on the Pinterest platform.
Pinterest Engineering
10 min read
Has Summary
--
The article discusses CriticGPT, a model based on GPT-4, designed to identify errors in ChatGPT responses.
Nat McAleese
5 min read
Includes Code
Has Summary
--
The article discusses the challenges and methodologies involved in training quadruped locomotion policies using NVIDIA Isaac Lab, emphasizing the importance of high-fidelity simulation for bridging...
Oyindamola Omotuyi
11 min read
Includes Code
Has Summary
--
The article discusses the advancements in robotics workflows through the latest release of NVIDIA Isaac Sim 4.0 and NVIDIA Isaac Lab.
Akhil Docca
10 min read
Has Summary
--
The article discusses the integration of AI in enabling practical quantum computing by addressing challenges in quantum processors, error correction, and algorithm development.
Mark Wolf
6 min read
Has Summary
--
The article introduces DRaFT+, an enhanced algorithm for fine-tuning text-to-image diffusion models, which aims to improve the alignment between input prompts and generated images.
Ali Taghibakhshi
9 min read
Includes Code
Has Summary
--
The article discusses Yoko Li's innovative work in AI, focusing on her projects like AI Town and AI Tamago, which utilize emergent behavior and large language models.
Xe Iaso
10 min read
Has Summary
--
The article discusses how to leverage Palantir AIP to build a semantic search application that uncovers insights from unstructured data within enterprises.
Palantir
6 min read
Includes Code
Has Summary
--
The article discusses how NVIDIA NeMo can streamline the development of generative AI applications on GPU-accelerated Google Cloud.
BERT, Dask, Fine-tuning, Generative AI, Google Cloud, GPT, Hugging Face, Python, Redis, Reinforcement Learning, T5, Transformer
Chintan Patel
9 min read
Has Summary
--
QHack 2023 showcased the intersection of quantum computing and machine learning, featuring 2,850 participants from 105 countries competing to develop innovative solutions using NVIDIA's quantum tec...
Tom Lubowe
8 min read
Has Summary
--
The article discusses how AutoDMP leverages AI and GPU technology to optimize macro placement in chip design, significantly improving performance and efficiency.
Anthony Agnesina
10 min read
Has Summary
--
The article summarizes the advancements and AI-powered solutions introduced in 2022, highlighting the most popular posts on the NVIDIA Technical Blog.
Michelle Horton
3 min read
Includes Code
Has Summary
--
The article discusses the DeXtreme project, which utilizes simulation to teach dexterity to a real robot hand.
Gavriel State
7 min read
Has Summary
--
This article discusses the application of reinforcement learning to create optimal recommendation systems that consider users' time budgets.
Netflix Technology Blog
13 min read
Has Summary
--
NVIDIA researchers introduced Factory, a novel simulation approach designed to enhance robotic assembly by enabling real-time, accurate simulations of contact-rich interactions.
Oyindamola Omotuyi
10 min read
Has Summary
--
The article discusses the innovative use of deep reinforcement learning (RL) to design arithmetic circuits, particularly in the context of NVIDIA GPUs.
Rajarshi Roy
8 min read
Includes Code
Has Summary
--
The article discusses NVIDIA's advancements in MLPerf Training v2.0, highlighting the full-stack optimizations that enhance performance across various AI workloads.
Ashraf Eassa
14 min read
Includes Code
Has Summary
--
The article discusses the performance improvements achieved in the NVIDIA MLPerf Training v1.1 benchmark through full-stack optimization.
--
NVIDIA has released an updated Edge AI and Robotics Teaching Kit aimed at university educators, developed in collaboration with experts from the University of Oxford and the University of Maryland,...
Jason Black
3 min read
Has Summary
--
The article discusses NVIDIA's research on transferring dexterous manipulation capabilities from GPU simulation to real-world robotic applications.
Varun Lodaya
11 min read
Has Summary
--
This article discusses how Airbnb utilizes task-oriented conversational AI to enhance customer support for hosts and guests.
Gavin Li
20 min read
Has Summary
--
The article discusses the MLPerf v1.0 training benchmarks, highlighting NVIDIA's record-setting performance across various AI workloads.
--
The article discusses the Isaac SDK and Isaac Sim, NVIDIA's robotics platform designed to accelerate the development of robotics applications through GPU optimization for AI and computer vision.
Brad Nemire
3 min read
Has Summary
--
The article discusses how AI can enhance fraud detection and prevention in banking, particularly through NVIDIA's GPU-accelerated machine learning and deep learning platforms.
Brad Nemire
2 min read
Has Summary
--
The article discusses the various robotics sessions and events hosted at GTC, focusing on Jetson tutorials, AI applications in STEM, and commercial uses of AI in robotics.
Brad Nemire
2 min read
Has Summary
--
NVIDIA has introduced Isaac Gym, a physics simulation environment designed to accelerate reinforcement learning (RL) research by leveraging GPU technology.
Nefi Alarcon
4 min read
Has Summary
--
The article discusses the challenges of designing neural network architectures and introduces Unified Neural Architecture Search (UNAS), a framework that combines the strengths of differentiable an...
Arash Vahdat
8 min read
Includes Code
Has Summary
--
The article discusses the challenges of sample inefficiency in reinforcement learning and introduces Nonparametric Off-Policy Policy Gradient (NOPG) as a solution.
Samuele Tosatto
9 min read
Has Summary
--
Facebook researchers have developed a reinforcement learning model that excels in heads-up, no-limit Texas hold'em and turn endgame hold'em poker, outperforming human competitors.
Nefi Alarcon
2 min read
Has Summary
--