Authors: GKM Liu
Year: 2024
Published in: Massachusetts Institute of Technology, 2023 - computing.mit.edu
Institution: Massachusetts Institute of Technology
Research Area: Reinforcement Learning with Human Feedback (RLHF), Human-AI Interaction
Discipline: Artificial Intelligence
The paper explores Reinforcement Learning with Human Feedback (RLHF) as a transformative tool to align AI with human values, mitigate bias, and democratize technology, while emphasizing its societal implications and ethical considerations.
Methods: The paper employs a systematic study of existing and potential societal effects of RLHF, guided by key questions addressing ethical, social, and practical impacts.
Key Findings: The study examines RLHF's effects on information integrity, societal values, social equity, access to AI, cultural relations, industrial transformation, and labor dynamics.
Citations: 17
Authors: J Dai, X Pan, R Sun, J Ji, X Xu, M Liu, Y Wang
Year: 2023
Published in: arXiv preprint, 2023 - arxiv.org
Institution: Cornell University, Georgia Institute of Technology
Research Area: Reinforcement Learning from Human Feedback (RLHF), Safe AI, Reinforcement Learning
Discipline: Artificial Intelligence, Machine Learning
DOI: https://doi.org/10.48550/arXiv.2310.12773
Citations: 598