Reinforcement Learning Programming Tutorials & Engineering Articles

94 Reinforcement Learning tutorials, guides, and engineering insights from NVIDIA, OpenAI, Uber, and more

Companies Using This

NVIDIA (45)

Reinforcement Learning Articles & Tutorials

NVIDIA

Advanced

How to Train an AI Agent for Command-Line Tasks with Synthetic Data and Reinforcement Learning

This article explores how to train an AI agent to operate a new Command Line Interface (CLI) using synthetic data generation and reinforcement learning.

Hugging Face, JSON, Python, Reinforcement Learning, RLHF, Shell

Chris Alexiuk

11 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Inside NVIDIA Nemotron 3: Techniques, Tools, and Data That Make It Efficient and Accurate

The article discusses the NVIDIA Nemotron 3, a family of open models designed for agentic AI systems, emphasizing its efficiency and accuracy through innovative architectures and techniques.

Hugging Face, Large Language Models, Reinforcement Learning, Transformer

Chris Alexiuk

9 min read

Has Summary

NVIDIA

Intermediate

How to Train Scientific Agents with Reinforcement Learning

The article discusses the development of scientific AI agents using reinforcement learning (RL) techniques, specifically through the NVIDIA NeMo framework.

Apache, Azure, Python, Reinforcement Learning, RLHF

Christian Munley

12 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Breaking Through Reinforcement Learning Training Limits with Scaling Rollouts in BroRL

The article introduces Broadened Reinforcement Learning (BroRL), a new paradigm that enhances the training of large language models (LLMs) by focusing on rollout scaling rather than just increasing...

Hugging Face, Reinforcement Learning

Jian Hu

6 min read

Has Summary

Netflix

Advanced

Post-Training Generative Recommenders with Advantage-Weighted Supervised Finetuning

Netflix introduces Advantage-Weighted Supervised Fine-Tuning (A-SFT), a novel post-training algorithm for generative recommender systems that addresses the unique challenges of applying reinforceme...

Fine-tuning, Reinforcement Learning, RLHF

Netflix Technology Blog

12 min read

Has Summary

Google

Advanced

Introducing Tunix: A JAX-Native Library for LLM Post-Training

The article introduces Tunix, a new open-source, JAX-native library designed for post-training of large language models (LLMs).

Flax, Google Cloud, JAX, Reinforcement Learning, RLHF

Srikanth Kilaru, Tianshu Bao

7 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Train a Quadruped Locomotion Policy and Simulate Cloth Manipulation with NVIDIA Isaac Lab and Newton

This article discusses the integration of the Newton physics engine with NVIDIA Isaac Lab for training quadruped locomotion policies and simulating cloth manipulation.

Apache, NumPy, Python, PyTorch, Reinforcement Learning, Warp, YAML

Mohammad Mohajerani

13 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Reinforcement Learning with NVIDIA NeMo-RL: Megatron-Core Support for Optimized Training Throughput

The article discusses the enhancements in reinforcement learning training throughput using NVIDIA NeMo-RL with Megatron-Core support.

PyTorch, Reinforcement Learning, YAML

Anna Shors

7 min read

Includes Code

Has Summary

NVIDIA

Advanced

Scaling LLM Reinforcement Learning with Prolonged Training Using ProRL v2

The article discusses the advancements in reinforcement learning for large language models (LLMs) through the introduction of ProRL v2 by NVIDIA Research.

Hugging Face, Reinforcement Learning

Jian Hu

7 min read

Includes Code

Has Summary

NVIDIA

Advanced

Build More Accurate and Efficient AI Agents with the New NVIDIA Llama Nemotron Super v1.5

The article discusses the release of the NVIDIA Llama Nemotron Super 49B v1.5, highlighting its advancements in accuracy, efficiency, and reasoning capabilities for AI agents.

Hugging Face, OpenAI API, Reinforcement Learning

Chris Alexiuk

5 min read

Has Summary

NVIDIA

Advanced

Reinforcement Learning with NVIDIA NeMo-RL: Reproducing a DeepScaleR Recipe Using GRPO

The article introduces NVIDIA NeMo-RL, an open-source library for reinforcement learning that supports scalable training from single-GPU to thousand-GPU models.

Hugging Face, Python, PyTorch, Reinforcement Learning

Alexander Bukharin

5 min read

Includes Code

Has Summary

Uber

Advanced

Reinforcement Learning for Modeling Marketplace Balance

This article discusses how Uber utilizes reinforcement learning techniques to enhance the efficiency of its marketplace by improving the balance between drivers and demand.

Reinforcement Learning

Prateek Jain, Soheil Sadeghi, Mehrdad Bakhtiari

11 min read

Has Summary

NVIDIA

Advanced

Advancing Agentic AI with NVIDIA Nemotron Open Reasoning Models

The article discusses the advancements in AI autonomy through NVIDIA's Nemotron open reasoning models, which enhance AI agents' decision-making capabilities in complex environments.

Hugging Face, Mistral, Reinforcement Learning, Transformer

Nirmal Kumar Juluru

6 min read

Has Summary

NVIDIA

Advanced

Build Enterprise AI Agents with Advanced Open NVIDIA Llama Nemotron Reasoning Models

The article discusses the development and capabilities of NVIDIA's Llama Nemotron reasoning models, which enhance AI agents' reasoning abilities for complex problem-solving in various industries.

Hugging Face, Reinforcement Learning, RLHF

Chris Alexiuk

11 min read

Has Summary

NVIDIA

Intermediate

Scale Synthetic Data and Physical AI Reasoning with NVIDIA Cosmos World Foundation Models

The article discusses how NVIDIA Cosmos World Foundation Models (WFMs) enhance the development of AI-driven robots and autonomous vehicles by providing high-fidelity, physics-aware synthetic data.

Hugging Face, JSON, Reinforcement Learning, Transformer

Pranjali Joshi

7 min read

Includes Code

Has Summary

Google

Intermediate

Introducing Gemma 3: The Developer Guide

Gemma 3 is the latest version of the Gemma open-model family, boasting enhanced capabilities such as multimodality, longer context windows, and improved reasoning.

Hugging Face, JAX, Ollama, Reinforcement Learning, RLHF, Transformers, Vertex AI

Omar Sanseviero, Philipp Schmid

5 min read

Includes Code

Has Summary

OpenAI

Intermediate

Deliberative alignment: reasoning enables safer language models

The article discusses a new alignment strategy called deliberative alignment, which teaches reasoning to language models to enhance their safety.

Claude, Constitutional AI, Gemini, GPT, Reinforcement Learning, RLHF

Melody Guan

8 min read

Has Summary

LinkedIn

Advanced

How we built domain-adapted foundation GenAI models to power our platform

The article discusses the development of domain-adapted foundation GenAI models at LinkedIn, focusing on their application within the Economic Opportunity Network (EON) project.

Azure, Generative AI, GPT, GPT-4, Kubernetes, Mistral, Reinforcement Learning, Retrieval Augmented Generation, RLHF

Praveen Kumar Bodigutla

12 min read

Has Summary

OpenAI

Beginner

Advancing red teaming with people and AI

The article discusses advancements in red teaming methodologies at OpenAI, focusing on the integration of human and AI efforts to identify potential risks in AI systems.

GPT, Reinforcement Learning

OpenAI

8 min read

Has Summary

Airbnb

Advanced

Transforming Location Retrieval at Airbnb: A Journey from Heuristics to Reinforcement Learning

The article discusses Airbnb's evolution in location retrieval from simple heuristics to advanced machine learning and reinforcement learning techniques.

Machine Learning, Reinforcement Learning

Dillon Davis

9 min read

Has Summary

Netflix

Intermediate

Recommending for Long-Term Member Satisfaction at Netflix

The article discusses Netflix's approach to enhancing long-term member satisfaction through its recommendation algorithms.

Reinforcement Learning

Netflix Technology Blog

9 min read

Has Summary

Pinterest

Intermediate

Building Pinterest Canvas, a text-to-image foundation model

The article discusses the development of Pinterest Canvas, a text-to-image foundation model aimed at enhancing existing images and products on the Pinterest platform.

CLIP, Diffusion Models, Embedding, Fine-tuning, Reinforcement Learning, Transformer

Pinterest Engineering

10 min read

Has Summary

OpenAI

Advanced

Finding GPT-4’s mistakes with GPT-4

The article discusses CriticGPT, a model based on GPT-4, designed to identify errors in ChatGPT responses.

GPT, GPT-4, Reinforcement Learning, RLHF

Nat McAleese

5 min read

Includes Code

Has Summary

NVIDIA

Advanced

Closing the Sim-to-Real Gap: Training Spot Quadruped Locomotion with NVIDIA Isaac Lab

The article discusses the challenges and methodologies involved in training quadruped locomotion policies using NVIDIA Isaac Lab, emphasizing the importance of high-fidelity simulation for bridging...

Python, Reinforcement Learning

Oyindamola Omotuyi

11 min read

Includes Code

Has Summary

NVIDIA

Advanced

Supercharge Robotics Workflows with AI and Simulation Using NVIDIA Isaac Sim 4.0 and NVIDIA Isaac Lab

The article discusses the advancements in robotics workflows through the latest release of NVIDIA Isaac Sim 4.0 and NVIDIA Isaac Lab.

Python, PyTorch, Reinforcement Learning

Akhil Docca

10 min read

Has Summary

NVIDIA

Advanced

Enabling Quantum Computing with AI

The article discusses the integration of AI in enabling practical quantum computing by addressing challenges in quantum processors, error correction, and algorithm development.

GPT, Reinforcement Learning, Transformer

Mark Wolf

6 min read

Has Summary

NVIDIA

Advanced

Enhance Text-to-Image Fine-Tuning with DRaFT+, Now Part of NVIDIA NeMo

The article introduces DRaFT+, an enhanced algorithm for fine-tuning text-to-image diffusion models, which aims to improve the alignment between input prompts and generated images.

CLIP, Diffusion Models, Fine-tuning, Reinforcement Learning, RLHF, Stable Diffusion

Ali Taghibakhshi

9 min read

Includes Code

Has Summary

Fly.io

Intermediate

How Yoko Li makes towns, tamagoes, and tools for local AI

The article discusses Yoko Li's innovative work in AI, focusing on her projects like AI Town and AI Tamago, which utilize emergent behavior and large language models.

JSON, LLaMA, Midjourney, Ollama, Reinforcement Learning

Xe Iaso

10 min read

Has Summary

Palantir

Intermediate

Building with Palantir AIP: Semantic Search

The article discusses how to leverage Palantir AIP to build a semantic search application that uncovers insights from unstructured data within enterprises.

Apache, Apache Spark, Reinforcement Learning, Retrieval Augmented Generation, RLHF, Semantic Search

Palantir

6 min read

Includes Code

Has Summary

NVIDIA

Advanced

Streamline Generative AI Development with NVIDIA NeMo on GPU-Accelerated Google Cloud

The article discusses how NVIDIA NeMo can streamline the development of generative AI applications on GPU-accelerated Google Cloud.

BERT, Dask, Fine-tuning, Generative AI, Google Cloud, GPT, Hugging Face, Python, Redis, Reinforcement Learning, T5, Transformer

Chintan Patel

9 min read

Has Summary

NVIDIA

Advanced

QHack Results Highlight Quantum Computing Applications and Tools on GPUs

QHack 2023 showcased the intersection of quantum computing and machine learning, featuring 2,850 participants from 105 countries competing to develop innovative solutions using NVIDIA's quantum tec...

ChatGPT, Chi, Reinforcement Learning

Tom Lubowe

8 min read

Has Summary

NVIDIA

Advanced

AutoDMP Optimizes Macro Placement for Chip Design with AI and GPUs

The article discusses how AutoDMP leverages AI and GPU technology to optimize macro placement in chip design, significantly improving performance and efficiency.

PyTorch, Reinforcement Learning, V

Anthony Agnesina

10 min read

Has Summary

NVIDIA

Intermediate

Year In Review: Trending Posts of 2022

The article summarizes the advancements and AI-powered solutions introduced in 2022, highlighting the most popular posts on the NVIDIA Technical Blog.

Diffusion Models, Reinforcement Learning

Michelle Horton

3 min read

Includes Code

Has Summary

NVIDIA

Beginner

Reinforcing the Value of Simulation by Teaching Dexterity to a Real Robot Hand

The article discusses the DeXtreme project, which utilizes simulation to teach dexterity to a real robot hand.

Reinforcement Learning

Gavriel State

7 min read

Has Summary

Netflix

Advanced

Reinforcement Learning for Budget Constrained Recommendations

This article discusses the application of reinforcement learning to create optimal recommendation systems that consider users' time budgets.

Reinforcement Learning

Netflix Technology Blog

13 min read

Has Summary

NVIDIA

Advanced

Advancing Robotic Assembly with a Novel Simulation Approach Using NVIDIA Isaac

NVIDIA researchers introduced Factory, a novel simulation approach designed to enhance robotic assembly by enabling real-time, accurate simulations of contact-rich interactions.

Assembly, Reinforcement Learning

Oyindamola Omotuyi

10 min read

Has Summary

NVIDIA

Intermediate

Designing Arithmetic Circuits with Deep Reinforcement Learning

The article discusses the innovative use of deep reinforcement learning (RL) to design arithmetic circuits, particularly in the context of NVIDIA GPUs.

Redis, Reinforcement Learning

Rajarshi Roy

8 min read

Includes Code

Has Summary

NVIDIA

Advanced

The Full Stack Optimization Powering NVIDIA MLPerf Training v2.0 Performance

The article discusses NVIDIA's advancements in MLPerf Training v2.0, highlighting the full-stack optimizations that enhance performance across various AI workloads.

BERT, Natural Language Processing, Python, PyTorch, Reinforcement Learning, ResNet, Transformers, U-Net

Ashraf Eassa

14 min read

Includes Code

Has Summary

NVIDIA

Advanced

Boosting NVIDIA MLPerf Training v1.1 Performance with Full Stack Optimization

The article discusses the performance improvements achieved in the NVIDIA MLPerf Training v1.1 benchmark through full-stack optimization.

Azure, BERT, PyTorch, Reinforcement Learning, ResNet, U-Net

Vinh Nguyen

21 min read

Includes Code

Has Summary

NVIDIA

Intermediate

NVIDIA Launches Updated Teaching Kit for Edge AI and Robotics Educators

NVIDIA has released an updated Edge AI and Robotics Teaching Kit aimed at university educators, developed in collaboration with experts from the University of Oxford and the University of Maryland,...

Deep Learning, Neural Networks, PyTorch, Reinforcement Learning, Spring

Jason Black

3 min read

Has Summary

NVIDIA

Advanced

NVIDIA Research: Transferring Dexterous Manipulation from GPU Simulation to a Remote, Real-World,

The article discusses NVIDIA's research on transferring dexterous manipulation capabilities from GPU simulation to real-world robotic applications.

Reinforcement Learning

Varun Lodaya

11 min read

Has Summary

Airbnb

Advanced

Task-Oriented Conversational AI in Airbnb Customer Support

This article discusses how Airbnb utilizes task-oriented conversational AI to enhance customer support for hosts and guests.

Machine Learning, PyTorch, Reinforcement Learning, RoBERTa, Transfer Learning, Transformers

Gavin Li

20 min read

Has Summary

NVIDIA

Advanced

MLPerf v1.0 Training Benchmarks: Insights into a Record-Setting NVIDIA Performance

The article discusses the MLPerf v1.0 training benchmarks, highlighting NVIDIA's record-setting performance across various AI workloads.

BERT, LSTM, PyTorch, Reinforcement Learning, ResNet, U-Net

Vinh Nguyen

30 min read

Includes Code

Has Summary

NVIDIA

Beginner

Develop Robotics Applications - Top Resources from GTC 21

The article discusses the Isaac SDK and Isaac Sim, NVIDIA's robotics platform designed to accelerate the development of robotics applications through GPU optimization for AI and computer vision.

Deep Learning, Reinforcement Learning

Brad Nemire

3 min read

Has Summary

NVIDIA

Advanced

Fraud Detection - Top Resources from GTC 21

The article discusses how AI can enhance fraud detection and prevention in banking, particularly through NVIDIA's GPU-accelerated machine learning and deep learning platforms.

Deep Learning, Google Cloud, Machine Learning, Reinforcement Learning

Brad Nemire

2 min read

Has Summary

NVIDIA

Intermediate

Robotics at GTC: Jetson tutorials, AI in STEM, and Commercial Apps

The article discusses the various robotics sessions and events hosted at GTC, focusing on Jetson tutorials, AI applications in STEM, and commercial uses of AI in robotics.

Deep Learning, Reinforcement Learning

Brad Nemire

2 min read

Has Summary

NVIDIA

Advanced

Introducing NVIDIA Isaac Gym: End-to-End Reinforcement Learning for Robotics

NVIDIA has introduced Isaac Gym, a physics simulation environment designed to accelerate reinforcement learning (RL) research by leveraging GPU technology.

PyTorch, Reinforcement Learning, TensorFlow

Nefi Alarcon

4 min read

Has Summary

NVIDIA

Intermediate

Discovering GPU-friendly Deep Neural Networks with Unified Neural Architecture Search

The article discusses the challenges of designing neural network architectures and introduces Unified Neural Architecture Search (UNAS), a framework that combines the strengths of differentiable an...

Neural Networks, PyTorch, Reinforcement Learning

Arash Vahdat

8 min read

Includes Code

Has Summary

NVIDIA

Advanced

Enhancing Sample Efficiency in Reinforcement Learning with Nonparametric Methods

The article discusses the challenges of sample inefficiency in reinforcement learning and introduces Nonparametric Off-Policy Policy Gradient (NOPG) as a solution.

Artificial Intelligence, PyTorch, Reinforcement Learning, TensorFlow, V

Samuele Tosatto

9 min read

Has Summary

NVIDIA

Advanced

Facebook’s AI Model Outmatches Competitors in Poker

Facebook researchers have developed a reinforcement learning model that excels in heads-up, no-limit Texas hold'em and turn endgame hold'em poker, outperforming human competitors.

PyTorch, Reinforcement Learning

Nefi Alarcon

2 min read

Has Summary