Papers by S Chaudhari


This page lists one peer-reviewed paper authored or co-authored by S Chaudhari in the Prolific Citations Library, a curated collection of research that uses human data collected through Prolific.

Papers (1 of 1)

RLHF deciphered: A critical analysis of reinforcement learning from human feedback for LLMs

Authors: S Chaudhari, P Aggarwal, V Murahari

Year: 2025

Published in: ACM Computing ..., 2025 - dl.acm.org

Institution: University of Massachusetts Amherst, Carnegie Mellon University, Princeton University

Research Area: Reinforcement Learning from Human Feedback (RLHF), Large Language Models

Discipline: Artificial Intelligence

The paper critically analyzes reinforcement learning from human feedback (RLHF) for large language models (LLMs), emphasizing both the importance and the limitations of reward models in building human-aligned AI systems.

Methods: Analyzed RLHF frameworks through the lens of reinforcement learning principles; conducted a categorical literature review to identify modeling challenges, underlying assumptions, and framework limitations.

Key Findings: Examines the fundamentals of RLHF, focusing on the role of reward models, the implications of design choices in RLHF training algorithms, and underlying issues such as generalization error, model misspecification, and feedback sparsity.

Citations: 117