Mitigating Bias in Reinforcement Learning from Human Feedback for Large Language Models
Authors: C Ravulu, R Sarabu, M Suryadevara
Published: 2024
Publication: ... Conference on AI x ..., 2024 - ieeexplore.ieee.org
Institutions: International Institute of Information Technology; University of California Santa Cruz; University of South Carolina Aiken
Research Areas: Reinforcement Learning from Human Feedback (RLHF), Bias Mitigation, Large Language Models (LLMs), AI Bias
Discipline: Artificial Intelligence
URL: https://ieeexplore.ieee.org/abstract/document/10990073/