Large Language Models Are More Persuasive Than Incentivized Human Persuaders
Abstract
We directly compare the persuasion capabilities of a frontier large language model (LLM; Claude Sonnet 3.5) against incentivized human persuaders in an interactive, real-time conversational quiz setting. In this preregistered, large-scale incentivized experiment, participants (quiz takers) completed an online quiz where persuaders (either humans or LLMs) attempted to persuade quiz takers toward correct or incorrect answers. We find that LLM persuaders achieved significantly higher compliance with their directional persuasion attempts than incentivized human persuaders, demonstrating superior persuasive capabilities in both truthful (toward correct answers) and deceptive (toward incorrect answers) contexts. We also find that LLM persuaders significantly increased quiz takers' accuracy, leading to higher earnings, when steering quiz takers toward correct answers, and significantly decreased their accuracy, leading to lower earnings, when steering them toward incorrect answers. Overall, our findings suggest that AI's persuasion capabilities already exceed those of humans that have real-money bonuses tied to performance. Our findings of increasingly capable AI persuaders thus underscore the urgency of emerging alignment and governance frameworks.
Study specs
- Authors
- P. Schoenegger, F. Salvi, J. Liu, X. Nan, R. Debnath, B. Fasolo, E. Leivada, G. Recchia, F. Günther, A. Zarifhonarvar, J. Kwon, Z. Ul Islam, M. Dehnert, D. Y. H. Lee, M. G. Reinecke, D. G. Kamper, M. Kobaş, A. Sandford, J. Kgomo, L. Hewitt, S. Kapoor, K. Oktar, E. E. Kucuk, B. Feng, C. R. Jones, I. Gainsburg, S. Olschewski, N. Heinzelmann, F. Cruz, B. M. Tappin, T. Ma, P. S. Park, R. Onyonka, A. Hjorth, P. Slattery, Q. Zeng, L. Finke, I. Grossmann, A. Salatiello, E. Karger
- Institution
- London School of Economics and Political Science, University of Cambridge, University College London, Massachusetts Institute of Technology, University of Oxford, Modulo Research, Stanford University, Federal Reserve Bank of Chicago, ETH Zürich, University of Johannesburg
- Discipline
- Social Science, Artificial Intelligence
- Year
- 2025
- Human Data Platform
- Prolific
Peer Review & Critical Discussion
Potential Selection Bias in 2023 Cohort
The participant pool shows a concerning overrepresentation of users from high-income demographics. Looking at Table 3, we can see that 78% of respondents had annual incomes above $75k, which significantly limits the generalizability of these findings to broader populations.
Non-naive Participants Issue
I've noticed a methodological concern regarding participant naivety. Given that Prolific users often complete multiple studies, there's a real risk that participants had prior exposure to similar experimental paradigms, which could confound the results.
RLHF Applicability to This Study Design
The implications for RLHF training pipelines are understated. If we accept the authors' conclusions about preference stability, this has direct consequences for how we should structure reward model training. The temporal decay effect described in Section 4.2 is particularly relevant.