RLHF Programming Tutorials & Engineering Articles

46 RLHF tutorials, guides, and engineering insights from NVIDIA, OpenAI, and Google

Companies Using This

NVIDIA (33)

OpenAI (8)

Google (2)

RLHF Articles & Tutorials

NVIDIA

Advanced

How to Train an AI Agent for Command-Line Tasks with Synthetic Data and Reinforcement Learning

This article explores how to train an AI agent to operate a new Command Line Interface (CLI) using synthetic data generation and reinforcement learning.

Hugging Face, JSON, Python, Reinforcement Learning, RLHF, Shell

Chris Alexiuk

11 min read

Includes Code

Has Summary

NVIDIA

Advanced

Inside the NVIDIA Rubin Platform: Six New Chips, One AI Supercomputer

The article discusses the NVIDIA Rubin platform, which introduces six new chips designed to create a powerful AI supercomputer.

Assembly, Hugging Face, JAX, Kubernetes, Less, PyTorch, RLHF, Transformer

Kyle Aubrey

59 min read

Has Summary

NVIDIA

Intermediate

How to Train Scientific Agents with Reinforcement Learning

The article discusses the development of scientific AI agents using reinforcement learning (RL) techniques, specifically through the NVIDIA NeMo framework.

Apache, Azure, Python, Reinforcement Learning, RLHF

Christian Munley

12 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Pioneering AI Co-Scientists for Fusion Research and Cancer Treatment

The article discusses the innovative use of AI co-scientists in scientific research, specifically focusing on fusion research and cancer treatment.

RLHF

Geetika Gupta

8 min read

Has Summary

Netflix

Advanced

Post-Training Generative Recommenders with Advantage-Weighted Supervised Finetuning

Netflix introduces Advantage-Weighted Supervised Fine-Tuning (A-SFT), a novel post-training algorithm for generative recommender systems that addresses the unique challenges of applying reinforceme...

Fine-tuning, Reinforcement Learning, RLHF

Netflix Technology Blog

12 min read

Has Summary

Google

Advanced

Introducing Tunix: A JAX-Native Library for LLM Post-Training

The article introduces Tunix, a new open-source, JAX-native library designed for post-training of large language models (LLMs).

Flax, Google Cloud, JAX, Reinforcement Learning, RLHF

Srikanth Kilaru, Tianshu Bao

7 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Build Custom Reasoning Models with Advanced, Open Post-Training Datasets

The article discusses the use of synthetic data in post-training procedures for large language models (LLMs) and highlights NVIDIA's open-sourcing of the Llama-Nemotron post-training dataset, which...

Hugging Face, RLHF

Vinh Nguyen

5 min read

Has Summary

NVIDIA

Advanced

Build Enterprise AI Agents with Advanced Open NVIDIA Llama Nemotron Reasoning Models

The article discusses the development and capabilities of NVIDIA's Llama Nemotron reasoning models, which enhance AI agents' reasoning abilities for complex problem-solving in various industries.

Hugging Face, Reinforcement Learning, RLHF

Chris Alexiuk

11 min read

Has Summary

Google

Intermediate

Introducing Gemma 3: The Developer Guide

Gemma 3 is the latest version of the Gemma open-model family, boasting enhanced capabilities such as multimodality, longer context windows, and improved reasoning.

Hugging Face, JAX, Ollama, Reinforcement Learning, RLHF, Transformers, Vertex AI

Omar Sanseviero, Philipp Schmid

5 min read

Includes Code

Has Summary

OpenAI

Advanced

Introducing GPT-4.5

The article introduces GPT-4.5, OpenAI's latest and most advanced model for chat, highlighting its improvements in unsupervised learning, emotional intelligence, and practical applications.

Azure, GPT, GPT-4, RLHF

OpenAI

12 min read

Has Summary

OpenAI

Intermediate

OpenAI GPT-4.5 System Card

The OpenAI GPT-4.5 System Card provides insights into the latest advancements in OpenAI's language model, highlighting its capabilities, safety evaluations, and preparedness framework.

GPT, GPT-4, RLHF

OpenAI

2 min read

Has Summary

NVIDIA

Advanced

Llama Nemotron Models Accelerate Agentic AI Workflows with Accuracy and Efficiency

The article discusses the Llama Nemotron models, which enhance agentic AI workflows by integrating large language models with advanced reasoning and planning capabilities.

Hugging Face, RLHF

Chintan Patel

7 min read

Has Summary

OpenAI

Intermediate

Deliberative alignment: reasoning enables safer language models

The article discusses a new alignment strategy called deliberative alignment, which teaches reasoning to language models to enhance their safety.

Claude, Constitutional AI, Gemini, GPT, Reinforcement Learning, RLHF

Melody Guan

8 min read

Has Summary

LinkedIn

Advanced

How we built domain-adapted foundation GenAI models to power our platform

The article discusses the development of domain-adapted foundation GenAI models at LinkedIn, focusing on their application within the Economic Opportunity Network (EON) project.

Azure, Generative AI, GPT, GPT-4, Kubernetes, Mistral, Reinforcement Learning, Retrieval Augmented Generation, RLHF

Praveen Kumar Bodigutla

12 min read

Has Summary

NVIDIA

Intermediate

New Reward Model Helps Improve LLM Alignment with Human Preferences

The article discusses the development of a new reward model, Llama 3.

ChatGPT, Claude, Hugging Face, RLHF

Zhilin Wang

3 min read

Has Summary

NVIDIA

Advanced

Deploying Accelerated Llama 3.2 from the Edge to the Cloud

The article discusses the deployment of the Llama 3.

Hugging Face, RLHF

Anjali Shah

6 min read

Has Summary

NVIDIA

Advanced

Optimizing Data Center Performance with AI Agents and the OODA Loop Strategy

The article discusses how NVIDIA optimizes data center performance using AI agents and the OODA loop strategy.

Elasticsearch, Kubernetes, LangChain, Python, RLHF, SQL

Aaron Erickson

11 min read

Has Summary

NVIDIA

Advanced

Power Text-Generation Applications with Mistral NeMo 12B Running on a Single GPU

The article discusses the Mistral NeMo 12B model, a next-generation language model developed by NVIDIA and Mistral, designed for high performance on a single GPU.

Apache, Artificial Intelligence, Embedding, Mistral, PyTorch, RLHF, Transformer

Anjali Shah

6 min read

Includes Code

Has Summary

NVIDIA

Advanced

Supercharging Llama 3.1 across NVIDIA Platforms

The article discusses the launch of Meta's Llama 3.

LangChain, LlamaIndex, RLHF

Anjali Shah

8 min read

Has Summary

NVIDIA

Intermediate

Customize Generative AI Models for Enterprise Applications with Llama 3.1

The article discusses the Llama 3.1 collection of large language models (LLMs) and their applications in enterprise settings.

Generative AI, HTML, Kubernetes, LangChain, LLaMA, LlamaIndex, RLHF

Chintan Patel

10 min read

Includes Code

Has Summary

OpenAI

Advanced

Improving Model Safety Behavior with Rule-Based Rewards

The article discusses the development and application of Rule-Based Rewards (RBRs) to enhance the safety behavior of AI models, reducing reliance on extensive human data collection.

GPT, RLHF

Tong Mu

9 min read

Has Summary

OpenAI

Intermediate

GPT-4o mini: advancing cost-efficient intelligence

The article introduces GPT-4o mini, OpenAI's most cost-efficient small model, designed to make AI intelligence more accessible and affordable.

Claude, Gemini, GPT, RLHF

OpenAI

6 min read

Has Summary

OpenAI

Advanced

Finding GPT-4’s mistakes with GPT-4

The article discusses CriticGPT, a model based on GPT-4, designed to identify errors in ChatGPT responses.

GPT, GPT-4, Reinforcement Learning, RLHF

Nat McAleese

5 min read

Includes Code

Has Summary

NVIDIA

Intermediate

NVIDIA Sets New Generative AI Performance and Scale Records in MLPerf Training v4.0

NVIDIA has achieved new generative AI performance records in MLPerf Training v4.0, showcasing significant advancements in training large language models (LLMs) and graph neural networks (GNNs).

BERT, Generative AI, GPT, ResNet, RLHF, Stable Diffusion, Transformer, Transformers, U-Net

Ashraf Eassa

10 min read

Has Summary

NVIDIA

Advanced

Enhance Text-to-Image Fine-Tuning with DRaFT+, Now Part of NVIDIA NeMo

The article introduces DRaFT+, an enhanced algorithm for fine-tuning text-to-image diffusion models, which aims to improve the alignment between input prompts and generated images.

CLIP, Diffusion Models, Fine-tuning, Reinforcement Learning, RLHF, Stable Diffusion

Ali Taghibakhshi

9 min read

Includes Code

Has Summary

NVIDIA

Advanced

Fine-Tune and Align LLMs Easily with NVIDIA NeMo Customizer

The article discusses NVIDIA NeMo Customizer, a microservice designed to simplify the fine-tuning and alignment of large language models (LLMs) for enterprise AI applications.

Kubernetes, LSTM, RLHF

Nirmal Kumar Juluru

5 min read

Has Summary

NVIDIA

Advanced

Simplify Custom Generative AI Development with NVIDIA NeMo Microservices

The article discusses the NVIDIA NeMo microservices, which simplify the development of custom generative AI models for enterprises.

Fine-tuning, Generative AI, Kubernetes, Microservices, RLHF

Nirmal Kumar Juluru

5 min read

Has Summary

NVIDIA

Intermediate

Unlock Your LLM Coding Potential with StarCoder2

The article discusses StarCoder2, an advanced large language model (LLM) designed to enhance coding efficiency for developers.

Git, Python, RLHF, Stable Diffusion

Chia-Chih Chen

7 min read

Includes Code

Has Summary

NVIDIA

Advanced

NVIDIA TensorRT-LLM Revs Up Inference for Google Gemma

NVIDIA collaborates with Google to enhance inference performance for the Gemma models using TensorRT-LLM, facilitating easier development with large language models (LLMs) on NVIDIA RTX GPUs.

Gemini, Hugging Face, Python, RLHF

Anjali Shah

4 min read

Has Summary

NVIDIA

Intermediate

Accelerating Inference on End-to-End Workflows with H2O.ai and NVIDIA

The article discusses the collaboration between H2O.ai and NVIDIA to enhance AI applications in financial services through generative AI and predictive analytics.

AutoML, Generative AI, GPT, H2O.ai, RLHF, XGBoost

Prabhu Ramamoorthy

13 min read

Has Summary

NVIDIA

Advanced

New NVIDIA NeMo Framework Features and NVIDIA H200 Supercharge LLM Training Performance and Versatility

The article discusses the latest features of the NVIDIA NeMo framework and the performance enhancements brought by the NVIDIA H200 GPUs, which significantly improve the training of large language m...

GPT, JAX, PyTorch, RLHF

Ashraf Eassa

9 min read

Has Summary

NVIDIA

Advanced

New Risk Calculation Record in Financial Services with Dell Technologies and NVIDIA H100 System for HPC and AI

Dell Technologies and NVIDIA have collaborated to set new records in financial risk calculations using the NVIDIA H100 system for high-performance computing (HPC) and AI.

ChatGPT, Fortran, Generative AI, GPT, LSTM, RLHF

Prabhu Ramamoorthy

7 min read

Has Summary

Palantir

Intermediate

Building with Palantir AIP: Semantic Search

The article discusses how to leverage Palantir AIP to build a semantic search application that uncovers insights from unstructured data within enterprises.

Apache, Apache Spark, Reinforcement Learning, Retrieval Augmented Generation, RLHF, Semantic Search

Palantir

6 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Mastering LLM Techniques: LLMOps

The article discusses the evolution of machine learning operations (MLOps) into specialized areas such as GenAIOps and LLMOps, focusing on the development and management of generative AI and large ...

ChatGPT, Embedding, Retrieval Augmented Generation, RLHF

Nik Spirin

13 min read

Has Summary

NVIDIA

Advanced

NVIDIA AI Foundation Models: Build Custom Enterprise Chatbots and Co-Pilots with Production-Ready

The article discusses NVIDIA's AI Foundation Models, specifically the Nemotron-3 8B family, which enables the creation of custom enterprise chatbots and co-pilots with production-ready capabilities.

Azure, Hugging Face, Kubernetes, Machine Learning, RLHF

Vivienne Zhang

12 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Build Custom Enterprise-Grade Generative AI with NVIDIA AI Foundation Models

The article discusses how to build custom enterprise-grade generative AI applications using NVIDIA's AI Foundation Models.

Chef, CLIP, Fine-tuning, Generative AI, Hugging Face, Java, Mistral, Python, RLHF, SQL, Stable Diffusion

Nirmal Kumar Juluru

7 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Getting Started with Large Language Models for Enterprise Solutions

The article discusses the application of Large Language Models (LLMs) in enterprise solutions, highlighting their capabilities in enhancing productivity across various industries.

ChatGPT, Embedding, Generative AI, Google Cloud, GPT, Large Language Models, Mistral, Retrieval Augmented Generation, RLHF, Stable Diffusion

Erik Pounds

13 min read

Has Summary

NVIDIA

Intermediate

Bringing Generative AI to Life with NVIDIA Jetson

NVIDIA has introduced the Jetson Generative AI Lab, enabling developers to leverage generative AI capabilities on Jetson edge devices.

CLIP, Generative AI, GitHub Actions, GPT, GPT-4, Gradio, Hugging Face, Modal, Oobabooga, RLHF, Segment Anything Model, Stable Diffusion, Transformers

Chitoku Yato

9 min read

Includes Code

Has Summary

NVIDIA

Advanced

Announcing NVIDIA SteerLM: A Simple and Practical Technique to Customize LLMs During Inference

NVIDIA SteerLM is a novel technique designed to simplify the customization of large language models (LLMs) during inference.

Emotion, GPT, LLaMA, PaLM, Python, RLHF

Yi Dong

10 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Mastering LLM Techniques: Customization

The article discusses various techniques for customizing Large Language Models (LLMs) to better fit enterprise needs, emphasizing the importance of tailoring language processing capabilities for sp...

Fine-tuning, GPT, Large Language Models, LSTM, Prompt Engineering, RLHF, Transfer Learning

Anjali Shah

11 min read

Includes Code

Has Summary

NVIDIA

Advanced

Unlocking the Power of Enterprise-Ready LLMs with NVIDIA NeMo

The article discusses NVIDIA NeMo, an end-to-end platform designed to facilitate the development and deployment of enterprise-ready large language models (LLMs).

ChatGPT, Dask, Embedding, Hugging Face, Redis, RLHF

Amanda Saunders

9 min read

Has Summary

NVIDIA

Intermediate

Generative AI Sparks Life into Virtual Characters with NVIDIA ACE for Games

Generative AI technologies are transforming how non-playable characters (NPCs) in games are created and how players interact with them, enabling developers to create more intelligent and dynamic gaming experiences.

Generative AI, LangChain, RLHF

Ike Nnoli

5 min read

Has Summary

NVIDIA

Beginner

NVIDIA Enables Trustworthy, Safe, and Secure Large Language Model Conversational Systems

NVIDIA introduces NeMo Guardrails, an open-source toolkit designed to create safe and trustworthy large language model (LLM) conversational systems.

ChatGPT, LangChain, Python, RLHF

Annamalai Chockalingam

7 min read

Includes Code

Has Summary

NVIDIA

Intermediate

NVIDIA Announces Generative AI Services for Language, Visual Content, and Biology Applications

NVIDIA has introduced generative AI services aimed at enhancing language, visual content, and biology applications.

BERT, CLIP, Deep Learning, Generative AI, GPT, Large Language Models, Natural Language Processing, RLHF, Stable Diffusion, T5

Annamalai Chockalingam

5 min read

Has Summary

OpenAI

Advanced

GPT-4

GPT-4 is the latest milestone in OpenAI's deep learning efforts, showcasing a large multimodal model that accepts both image and text inputs.

Azure, GPT, GPT-4, PaLM, RLHF, Transformers

OpenAI

15 min read

Has Summary

OpenAI

Intermediate

Aligning language models to follow instructions

The article discusses advancements in training language models to better follow user instructions, specifically focusing on the InstructGPT models developed by OpenAI.

GPT, Large Language Models, OpenAI API, RLHF

Ryan Lowe

12 min read

Has Summary
