How OpenAI Uses RLHF
8 articles about RLHF from OpenAI's engineering team
Articles
The article introduces GPT-4.5, OpenAI's latest and most advanced model for chat, highlighting its improvements in unsupervised learning, emotional intelligence, and practical applications.
The OpenAI GPT-4.5 System Card provides insights into the latest advancements in OpenAI's language model, highlighting its capabilities, safety evaluations, and preparedness framework.
The article discusses a new alignment strategy called deliberative alignment, which teaches reasoning to language models to enhance their safety.
Melody Guan
8 min read
The article discusses the development and application of Rule-Based Rewards (RBRs) to enhance the safety behavior of AI models, reducing reliance on extensive human data collection.
The article introduces GPT-4o mini, OpenAI's most cost-efficient small model, designed to make AI more accessible and affordable.
The article discusses CriticGPT, a model based on GPT-4, designed to identify errors in ChatGPT responses.
Nat McAleese
5 min read
Includes Code
The article discusses advancements in training language models to better follow user instructions, specifically focusing on the InstructGPT models developed by OpenAI.
Ryan Lowe
12 min read