Reinforcement Learning from Interactive Human Feedback: A Comprehensive Survey
Reinforcement learning from human feedback (RLHF) is a powerful approach that learns agent behavior directly from interactive human feedback, overcoming the limitations of manually engineered reward functions.
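To make the contrast with hand-engineered rewards concrete, the sketch below (our illustration, not drawn from any particular surveyed system) fits a small reward model to pairwise human preferences using the Bradley-Terry loss, the formulation popularized by Christiano et al. (2017); the names `RewardModel` and `preference_loss` are hypothetical, and PyTorch is assumed.

```python
# Minimal sketch: instead of hand-engineering a reward, fit a reward
# model r_phi to pairwise human preferences via the Bradley-Terry loss
#   L = -log sigmoid(r_phi(preferred) - r_phi(rejected)).
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Maps a trajectory-segment feature vector to a scalar reward."""
    def __init__(self, feat_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def preference_loss(model, preferred, rejected):
    """Bradley-Terry negative log-likelihood of the human's choices."""
    return -torch.nn.functional.logsigmoid(
        model(preferred) - model(rejected)
    ).mean()

# Toy usage: 32 preference pairs over 8-dim segment features
# (randomly generated here purely for illustration).
torch.manual_seed(0)
model = RewardModel(feat_dim=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
preferred = torch.randn(32, 8)  # segments the human preferred
rejected = torch.randn(32, 8)   # segments the human rejected
for _ in range(100):
    opt.zero_grad()
    loss = preference_loss(model, preferred, rejected)
    loss.backward()
    opt.step()
# The learned reward model then stands in for the hand-engineered
# reward when training the agent with any standard RL algorithm.
```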