F Chiu: Researcher — Prolific Citations Library

This page lists the one peer-reviewed paper in the Prolific Citations Library that was authored or co-authored by F Chiu. The library is a curated collection of research powered by high-quality human data from Prolific.

Papers (1 of 1)

A Descriptive and Normative Theory of Human Beliefs in RLHF

Authors: S Dandekar, S Deshmukh, F Chiu, WB Knox

Year: 2025

Published in: arXiv preprint arXiv ..., 2025

Institution: University of California, Davis; Northwestern University

Research Area: Reinforcement Learning from Human Feedback (RLHF), Human-AI Interaction, AI Theory

Discipline: Artificial Intelligence, Social Science

The paper investigates how human beliefs about agent capabilities influence the preferences people provide in RLHF. It proposes a model for minimizing the mismatch between those beliefs and idealized agent capabilities, with the goal of improving policy performance.

Methods: Human studies and synthetic experiments to model and test the impact of belief mismatches on human preferences and RLHF effectiveness.

Key Findings: Human beliefs about agent capabilities affect the preferences people provide and, in turn, the performance of RLHF policies.

DOI: https://doi.org/10.48550/arXiv.2506.01692