A survey of reinforcement learning from human feedback
Authors: T Kaufmann, P Weng, V Bengs, E Hüllermeier
Published: 2024
Publication: epub.ub.uni-muenchen.de (2024)
This paper surveys the fundamentals, diverse applications, and evolving impact of reinforcement learning from human feedback (RLHF), emphasizing its role in improving the alignment and performance of intelligent systems.
Methods: The paper is a literature survey that synthesizes existing research on how reinforcement learning algorithms interact with human input.
Key Findings: The paper reviews the principles, dynamics, applications, and trends of RLHF and discusses its role in improving large language models (LLMs) and other intelligent systems.
Limitations: The paper reports no experimental evaluations or new primary data; it is a broad theoretical and literature-based overview of RLHF.
Institution: Paderborn University, German Research Center for Artificial Intelligence (DFKI), Duke Kunshan University
Research Area: Reinforcement Learning from Human Feedback (RLHF), LLM, Reward Modeling
Discipline: Artificial Intelligence
Citations: 354