Browse 7 peer-reviewed papers from researchers at Princeton University, spanning LLM and Reinforcement Learning from Human Feedback (RLHF) research (2022–2025). These papers are part of the Prolific Citations Library, a curated collection of research powered by high-quality participant data from Prolific.
-
Authors: S Chaudhari, P Aggarwal, V Murahari
Year: 2025
Published in: ACM Computing ..., 2025 - dl.acm.org
Institution: University of Massachusetts Amherst, Carnegie Mellon University, Princeton University
Research Area: Reinforcement Learning from Human Feedback (RLHF), LLM
Discipline: Artificial Intelligence
The paper critically analyzes reinforcement learning from human feedback (RLHF) for large language models (LLMs), emphasizing the importance and limitations of reward models in improving human-aligned AI systems.
Methods: Analyzed RLHF frameworks through reinforcement learning principles; conducted a categorical literature review to identify modeling challenges, assumptions, and framework limitations.
Key Findings: Reward models are central to RLHF but are subject to underlying issues including generalization error, model misspecification, and feedback sparsity; design choices in RLHF training algorithms carry significant implications for human-aligned AI systems.
Citations: 117
-
Authors: F Salvi, M Horta Ribeiro, R Gallotti, R West
Year: 2025
Published in: Nature Human Behaviour, 2025 - nature.com
Institution: EPFL, Fondazione Bruno Kessler, Princeton University
Research Area: Conversational Persuasion of LLM, Human-Computer Interaction (HCI), Behavioral Science, LLM
Discipline: Behavioral Science
GPT-4 can use personalized arguments to be more persuasive in debates, outperforming humans in 64.4% of AI-human comparisons when personalization is applied.
Methods: Preregistered controlled study involving multiround debates with random assignment to conditions focusing on AI-human comparisons, personalization, and opinion strength.
Key Findings: GPT-4 was more persuasive than human opponents in debates, particularly when it tailored its arguments using personal information about its interlocutor.
Citations: 65
Sample Size: 900
-
Authors: SSY Kim, JW Vaughan, QV Liao, T Lombrozo
Year: 2025
Published in: Proceedings of the ..., 2025 - dl.acm.org
Institution: Wake Forest University, University of Illinois at Urbana-Champaign, Princeton University, University of California Berkeley
Research Area: Appropriate Reliance on LLMs, Explainable AI, Human-AI Interaction, Cognitive Psychology
Discipline: Cognitive Psychology, Artificial Intelligence, Human-Computer Interaction (HCI)
The study examines factors that influence users' reliance on LLM responses, finding that explanations increase reliance, while the presence of sources and inconsistencies within explanations reduce reliance on incorrect responses.
Methods: Think-aloud study followed by a pre-registered, controlled experiment to assess the impact of explanations, sources, and inconsistencies in LLM responses on user reliance.
Key Findings: Explanations increased users' reliance on LLM responses regardless of correctness, whereas sources and inconsistent explanations helped users calibrate reliance on incorrect responses.
DOI: https://doi.org/10.1145/3706598.3714020
Citations: 38
Sample Size: 308
-
Authors: JQ Zhu, JC Peterson, B Enke, TL Griffiths
Year: 2025
Published in: Nature Human Behaviour, 2025 - nature.com
Institution: Princeton University, Boston University, Harvard University
Research Area: Strategic decision-making, Machine learning, Computational Cognitive Science
Discipline: Artificial Intelligence
This study used deep neural networks to analyze human strategic decision-making, predicting choices more accurately than existing theories and uncovering the context-dependent nature of reasoning and decision-making in complex games.
Methods: Deep neural networks trained on data from procedurally generated matrix games with over 2,400 variations; models were modified for interpretability.
Key Findings: Deep neural networks predicted human choices in the initial play of two-player matrix games more accurately than existing theories, revealing context-dependent strategic reasoning shaped by game complexity.
DOI: https://doi.org/10.1038/s41562-025-02230-5
Citations: 16
Sample Size: 90000
-
Authors: S Zorowitz, J Solis, Y Niv, D Bennett
Year: 2023
Published in: Nature Human Behaviour, 2023 - nature.com
Institution: Princeton University, Rutgers University, Monash University
Research Area: Research Methodology, Behavioral Research, Experimental Psychology (focus on data quality and spurious correlations)
Discipline: Behavioral Science
DOI: https://doi.org/10.1038/s41562-023-01640-7
Citations: 110
-
Authors: V Kewenig, A Lampinen, SA Nastase
Year: 2023
Published in: arXiv preprint arXiv ..., 2023 - arxiv.org
Institution: University College London, Princeton University, University of Exeter
Research Area: Computational Linguistics, Cognitive Science
Discipline: Computational Linguistics
DOI: https://doi.org/10.48550/arXiv.2308.06035
Citations: 3
-
Authors: K Vodrahalli, T Gerstenberg
Year: 2022
Published in: Advances in Neural ..., 2022 - proceedings.neurips.cc
Institution: Columbia University, Princeton University, Intel, Stanford University, Massachusetts Institute of Technology
Research Area: Human-AI Collaboration, Human Behavior Modeling, Decision Making
Discipline: Artificial Intelligence
Citations: 70