How OpenAI Uses RLHF

8 articles about RLHF from OpenAI's engineering team

Articles

OpenAI
Advanced
The article introduces GPT-4.5, OpenAI's latest and most advanced model for chat, highlighting its improvements in unsupervised learning, emotional intelligence, and practical applications.
OpenAI
12 min read
Has Summary
--
OpenAI
Intermediate
The OpenAI GPT-4.5 System Card provides insights into the latest advancements in OpenAI's language model, highlighting its capabilities, safety evaluations, and preparedness framework.
OpenAI
2 min read
Has Summary
--
OpenAI
Intermediate
The article discusses a new alignment strategy called deliberative alignment, which trains language models to reason explicitly about safety specifications before answering.
--
OpenAI
Advanced
The article discusses the development and application of Rule-Based Rewards (RBRs) to enhance the safety behavior of AI models, reducing reliance on extensive human data collection.
Tong Mu
9 min read
Has Summary
--
OpenAI
Intermediate
The article introduces GPT-4o mini, OpenAI's most cost-efficient small model, designed to make AI more accessible and affordable.
OpenAI
6 min read
Has Summary
--
OpenAI
Advanced
The article discusses CriticGPT, a model based on GPT-4, designed to identify errors in ChatGPT responses.
Nat McAleese
5 min read
Includes Code
Has Summary
--
OpenAI
Advanced
GPT-4 is the latest milestone in OpenAI's deep learning efforts, showcasing a large multimodal model that accepts both image and text inputs.
OpenAI
15 min read
Has Summary
--
OpenAI
Intermediate
The article discusses advancements in training language models to better follow user instructions, specifically focusing on the InstructGPT models developed by OpenAI.
Ryan Lowe
12 min read
Has Summary
--
