RLHF

Full Form: Reinforcement Learning from Human Feedback

Category: AI Techniques

📖 Definition

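RLHF fine-tunes AI models using human preference data. Human raters score or compare model outputs, those judgments are typically used to train a reward model, and the AI model is then optimized with reinforcement learning to produce outputs that the reward model scores highly. The goal is to bring the model's behavior closer to what people actually prefer.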
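
To make the "learning from preferences" step concrete, here is a minimal sketch of how a reward model could be fit to pairwise human comparisons. It assumes a Bradley-Terry style loss, -log(sigmoid(r_chosen - r_rejected)), which is a common choice but not the only one. The linear reward model, the two-number feature vectors, and the function names are all hypothetical and chosen only for illustration. The reinforcement learning step that follows in a real pipeline is not shown.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def reward(weights, features) -> float:
    """Hypothetical linear reward model: a weighted sum of output features."""
    return sum(w * x for w, x in zip(weights, features))


def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)), computed stably."""
    margin = r_chosen - r_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def train_reward_model(pairs, dim, lr=0.1, epochs=200):
    """Fit the linear reward model to (chosen, rejected) feature pairs with SGD."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # Gradient of the loss w.r.t. the margin is -(1 - sigmoid(margin)).
            grad_scale = -(1.0 - sigmoid(margin))
            for i in range(dim):
                weights[i] -= lr * grad_scale * (chosen[i] - rejected[i])
    return weights


# Toy data (made up): each output is described by [helpfulness, rudeness],
# and in every pair the first output is the one the human rater preferred.
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.8, 0.1], [0.3, 0.9]),
]
weights = train_reward_model(pairs, dim=2)
```

After training on this toy data, the learned weight for the first feature is positive and the weight for the second is negative, so the reward model scores outputs resembling the preferred ones more highly. In a full RLHF setup, a score like this would serve as the reward signal when fine-tuning the AI model itself.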

🔑 Key Points

💡 Why It Matters

RLHF is one of the main techniques AI labs use to make their models more helpful and better aligned with human values. It plays a central role in creating AI that people want to use.