This page lists 19 peer-reviewed papers (2020–2025) from researchers at the University of Cambridge, spanning LLM and Computational Social Science research, in the Prolific Citations Library: a curated collection of research powered by Prolific's high-quality participant data.
-
Authors: N Grgić-Hlača, G Lima, A Weller
Year: 2022
Published in: Proceedings of the 2nd ..., 2022 - dl.acm.org
Institution: Max Planck Institute, École Polytechnique Fédérale de Lausanne, University of Cambridge, The Alan Turing Institute
Research Area: Algorithmic Fairness, Human Perception, Diversity in AI Decision-Making
Discipline: Social Science, Artificial Intelligence
This study examines how sociodemographic factors and personal experience influence perceptions of fairness in algorithmic decision-making, particularly in bail decisions, highlighting the importance of diverse perspectives in regulatory oversight.
Methods: Explored perceptions of procedural fairness using surveys to assess the influence of demographics and personal experiences.
Key Findings: Impact of demographics (age, education, gender, race, political views) and personal experience on perceptions of fairness of algorithmic feature use in bail decisions.
DOI: 10.1145/3551624.3555306
Citations: 62
-
Authors: K Hackenburg, BM Tappin, P Röttger, SA Hale
Year: 2025
Published in: Proceedings of the ..., 2025 - pnas.org
Institution: University of California Berkeley, University of Cambridge, University of Oxford, Max Planck Institute
Research Area: Political Persuasion, LLM
Discipline: Computational Social Science, Political Science
Scaling language model size yields diminishing returns in the persuasiveness of generated political messages: larger models provide minimal gains over smaller ones once task-completion metrics such as coherence and relevance are controlled for.
Methods: Generated 720 political messages using 24 LLMs of varying sizes and tested their persuasiveness through a large-scale randomized survey experiment.
Key Findings: Persuasive capability of language models across different sizes in generating political messages.
Citations: 31
Sample Size: 25982
-
Authors: P. Schoenegger, F. Salvi, J. Liu, X. Nan, R. Debnath, B. Fasolo, E. Leivada, G. Recchia, F. Günther, A. Zarifhonarvar, J. Kwon, Z. Ul Islam, M. Dehnert, D. Y. H. Lee, M. G. Reinecke, D. G. Kamper, M. Kobaş, A. Sandford, J. Kgomo, L. Hewitt, S. Kapoor, K. Oktar, E. E. Kucuk, B. Feng, C. R. Jones, I. Gainsburg, S. Olschewski, N. Heinzelmann, F. Cruz, B. M. Tappin, T. Ma, P. S. Park, R. Onyonka, A. Hjorth, P. Slattery, Q. Zeng, L. Finke, I. Grossmann, A. Salatiello, E. Karger
Year: 2025
Published in: arXiv preprint arXiv ..., 2025 - arxiv.org
Institution: London School of Economics and Political Science, University of Cambridge, University College London, Massachusetts Institute of Technology, University of Oxford, Modulo Research, Stanford University, Federal Reserve Bank of Chicago, ETH Zürich, University of Johannesburg
Research Area: Computation and Language
Discipline: Social Science, Artificial Intelligence
This paper compares a frontier LLM (Claude 3.5 Sonnet) against incentivized human persuaders in a conversational quiz setting, finding that the AI's persuasiveness surpasses that of human persuaders who earned real-money bonuses tied to performance.
Citations: 16
-
Authors: T Hu, N Collier
Year: 2025
Published in: arXiv preprint arXiv:2503.03335, 2025 - arxiv.org
Institution: University of Cambridge
Research Area: Affective Computing, Natural Language Processing, Computational Social Science
Discipline: Computational Social Science
The iNews dataset is a multimodal resource for studying personalized affective responses to news, improving modeling accuracy by incorporating annotator persona metadata.
Methods: 292 demographically diverse UK participants annotated 2,899 Facebook news posts with multidimensional labels (e.g., emotions, valence, arousal), combined with comprehensive participant persona data.
Key Findings: Modeled personalized affective responses to news through annotations capturing valence, arousal, emotions, and persona metadata.
Citations: 2
Sample Size: 2899
-
Authors: A Warrier, D Nguyen, M Naim, M Jain, Y Liang, K Schroeder, C Yang, JB Tenenbaum, S Vollmer, K Ellis, Z Tavares
Year: 2025
Published in: arXiv preprint arXiv …, 2025 - arxiv.org
Institution: Basis Research Institute, DFKI GmbH, Harvard University, Quebec AI Institute, University of Cambridge, Massachusetts Institute of Technology, Cornell University
Research Area: Agent learning, World Models, Benchmarking, Evaluation protocols, RLHF, LLM
Discipline: Computer Science, Artificial Intelligence, Machine Learning
The paper introduces WorldTest, a novel protocol for evaluating model-learning agents using reward-free exploration and behavior-based scoring, and demonstrates that humans outperform models on the AutumnBench suite of tasks, revealing significant gaps in world-model learning.
Methods: The authors proposed WorldTest, a protocol separating reward-free interaction from scored tests in related environments, with evaluations done using AutumnBench—a dataset of 43 grid-world environments and 129 tasks across prediction, planning, and causal dynamics.
Key Findings: Performance of model-learning agents and humans in acquiring world models for masked-frame prediction, planning, and understanding causal dynamics.
Citations: 1
Sample Size: 517
-
Authors: N Tyulina, Y Yu, TA Emmanouil, SI Levitan
Year: 2025
Published in: Proceedings of the 7th ACM ..., 2025 - dl.acm.org
Institution: University of Cambridge, University of Bath, University of Edinburgh, New York University
Research Area: Human-AI Interaction, Trust and Perception, Nonverbal Communication
Discipline: Applied Linguistics
Trust judgments are primarily influenced by auditory cues in both humans and multimodal models, though subtle differences in modality weighting exist between them.
Methods: Behavioral experiment with trust ratings of bimodal stimuli across four trust congruence conditions, combined with a multimodal model trained using HuBERT and ResNet-50 with late fusion, analyzed using Permutation Feature Importance (PFI).
Key Findings: The construction of trust from visual and auditory signals in both humans and multimodal models, focusing on modality dominance and feature weighting.
Sample Size: 150
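The Permutation Feature Importance analysis named in the Methods above can be sketched in a few lines. This toy version is not the paper's HuBERT/ResNet-50 pipeline; it uses a hypothetical scoring function, but illustrates the same idea: shuffle one feature column at a time and measure the drop in accuracy.

```python
import random

def permutation_importance(model, X, y, n_features, n_repeats=10, seed=0):
    """Mean accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    base = sum(model(x) == t for x, t in zip(X, y)) / len(y)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            col = [x[j] for x in X]
            rng.shuffle(col)
            # Rebuild the dataset with only feature j permuted.
            Xp = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, col)]
            acc = sum(model(x) == t for x, t in zip(Xp, y)) / len(y)
            drops.append(base - acc)
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy model whose prediction depends only on feature 0.
model = lambda x: x[0] > 0.5
X = [[i / 9, (9 - i) / 9] for i in range(10)]
y = [v[0] > 0.5 for v in X]
imp = permutation_importance(model, X, y, n_features=2)
```

Shuffling the feature the model actually uses degrades accuracy, so it receives a positive importance score; the ignored feature scores zero, which is how modality dominance can be read off such an analysis.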
-
Authors: M Reis, F Reis, W Kunde
Year: 2024
Published in: Nature Medicine, 2024 - nature.com
Institution: University of Cambridge, Julius Maximilians Universität
Research Area: AI in Healthcare, Medical Ethics, Cognitive Psychology, Human-Computer Interaction (HCI) in Medicine
Discipline: AI in Healthcare, Medical Ethics, Cognitive Psychology
The study found that medical advice labeled as being sourced from AI (or AI supervised by humans) is perceived as less reliable and less empathetic than identical advice labeled as originating solely from a human physician, resulting in reduced willingness to follow it.
Methods: Two preregistered studies were conducted where participants were presented with identical medical advice scenarios but with manipulated labels for the advice source ('AI', 'human physician', 'human+AI').
Key Findings: Participants' perceptions of reliability, empathy, and willingness to follow medical advice based on the perceived source.
Citations: 78
Sample Size: 2280
-
Authors: S Kapoor, N Gruver, M Roberts
Year: 2024
Published in: Advances in ..., 2024 - proceedings.neurips.cc
Institution: Abacus AI, University of Cambridge, New York University, Columbia University
Research Area: Uncertainty Estimation, LLM Limitations, Know-What-You-Don't-Know, Computational Cognition
Discipline: Artificial Intelligence
Fine-tuning large language models (LLMs) on a small dataset of graded examples improves uncertainty estimations, enhancing their applicability in high-stakes scenarios and human-AI collaboration.
Methods: The researchers fine-tuned LLMs using a small dataset of graded correct and incorrect answers with LoRA (Low-Rank Adaptation) to create uncertainty estimates and conducted a user study to investigate their utility in human-AI collaboration.
Key Findings: Calibration and generalization of uncertainty estimates, performance of fine-tuning LLMs for uncertainty estimation, and human-AI interaction improvements informed by uncertainty data.
Citations: 71
Sample Size: 1000
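Calibration of uncertainty estimates, as evaluated in this paper, is commonly measured with Expected Calibration Error. The sketch below is a standard ECE computation on (confidence, correctness) pairs, not the paper's fine-tuning pipeline: confidences are binned, and the gap between mean confidence and accuracy in each bin is averaged, weighted by bin size.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin-weighted average gap between mean confidence and accuracy."""
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # clamp c == 1.0 into last bin
        bins[idx].append((c, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        acc = sum(ok for _, ok in b) / len(b)
        ece += (len(b) / n) * abs(avg_conf - acc)
    return ece

# Perfectly calibrated toy data: 95% confidence, 19 of 20 correct.
perfect = expected_calibration_error([0.95] * 20, [True] * 19 + [False])
```

A well-calibrated model (here, 95% confidence with 95% accuracy) scores an ECE of zero; overconfident answers push the score up, which is the failure mode the graded fine-tuning data targets.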
-
Authors: S Valentin, S Kleinegesse, NR Bramley, P Seriès
Year: 2024
Published in: eLife, 2024 - elifesciences.org
Institution: University of Edinburgh, University of Cambridge
Research Area: Bayesian Optimal Experimental Design (BOED) in Behavioral Research
Discipline: Artificial Intelligence, Psychology
The paper presents a tutorial on using Bayesian optimal experimental design (BOED) and machine learning to design experiments that efficiently test and evaluate cognitive models, validated via simulations and a real-world case study of exploration-exploitation decision-making.
Methods: The paper employs Bayesian optimal experimental design (BOED) coupled with machine learning to identify optimal experimental configurations. Simulations and a real-world multi-armed bandit experiment are used for validation.
Key Findings: The capacity of BOED to distinguish between cognitive models, parameters explaining human behavior, and how people balance exploration and exploitation.
DOI: 10.7554/eLife.86224
Citations: 15
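The core BOED idea in this tutorial, choosing the experimental design most informative about which cognitive model is correct, can be sketched with a one-trial expected information gain. The two candidate models below are hypothetical stand-ins (a spread-sensitive and an indifferent chooser), not the paper's models; with a uniform prior over two models and a binary outcome, EIG is the entropy of the marginal outcome minus the mean per-model entropy.

```python
import math

def binary_entropy(p):
    """Entropy (bits) of a Bernoulli outcome with success probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def expected_info_gain(design, models):
    """EIG about model identity from one binary outcome, uniform prior."""
    probs = [m(design) for m in models]
    marginal = sum(probs) / len(probs)
    return binary_entropy(marginal) - sum(binary_entropy(p) for p in probs) / len(probs)

# Hypothetical models predicting P(risky choice) as a function of reward spread.
model_a = lambda d: 1 / (1 + math.exp(-2 * d))   # spread-seeking
model_b = lambda d: 0.5                          # indifferent
designs = [0.0, 0.5, 1.0, 2.0]
best = max(designs, key=lambda d: expected_info_gain(d, [model_a, model_b]))
```

At spread 0 both models predict identical behavior, so the trial is uninformative (EIG of zero); the search instead selects the largest spread, where the models' predictions diverge most, which is exactly the design-discrimination logic BOED automates at scale.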
-
Authors: M Tahaei, D Wilkinson, A Frik, M Muller
Year: 2024
Published in: Proceedings of the ..., 2024 - ojs.aaai.org
Institution: University of Cambridge, University of Bath, University of Amsterdam, Amazon
Research Area: AI Ethics, Survey Methods, AI Governance
Discipline: AI Ethics, Governance
DOI: 10.1609/aies.v7i1.31734
Citations: 11
-
Authors: Eyal Peer
Year: 2024
Published in: Cambridge
Institution: Hebrew University, University of Cambridge
Research Area: Crowdsourcing, Research Methodology in Behavioral and Social Sciences
Discipline: Social, Behavioral Sciences
Citations: 7
-
Authors: T Davidson
Year: 2024
Published in: 2024 - files.osf.io
Institution: University of Cambridge
Research Area: Content Moderation, Multimodal LLM Auditing, Computational Social Science
Discipline: Computational Social Science
Citations: 2
-
Authors: Z Qiu, W Liu, H Feng, Z Liu, T Xiao
Year: 2024
Published in: arXiv
Institution: Massachusetts Institute of Technology, Max Planck Institute, University of Cambridge
Research Area: Computational cognition, LLM evaluation, Program synthesis, Multimodal reasoning
Discipline: Artificial Intelligence
Introduces SGP-Bench, a benchmark testing whether LLMs can answer semantic and spatial questions about images purely from the graphics programs that render them (SVG/CAD), effectively probing "visual imagination without vision." The authors show that current LLMs struggle, sometimes performing near chance, even on images that are trivial for humans, but demonstrate that Symbolic Instruction Tuning (SIT) can meaningfully improve this capability.
-
Authors: Lexin Zhou, Wout Schellaert, Fernando Martínez-Plumed, Yael Moros-Daval, Cèsar Ferri, José Hernández-Orallo
Year: 2024
Published in: Nature
Institution: Universitat Politècnica de València, University of Cambridge, ValGRAI
Research Area: LLM reliability and evaluation, competency assessment
Discipline: Artificial Intelligence, Behavioral Science
-
Authors: T Prike, LH Butler, UKH Ecker
Year: 2024
Published in: Scientific Reports, 2024 - nature.com
Institution: University of Western Australia, University of Exeter, University of Cambridge
Research Area: Social Science, Misinformation, Human Behavior, Media Studies
Discipline: Social Science
DOI: 10.1038/s41598-024-57560-7
Citations: 45
-
Authors: O Henkel, L Hills
Year: 2023
Published in: Proceedings of the Tenth ACM Conference on ..., 2023 - dl.acm.org
Institution: University of Cambridge, University of Bath
Research Area: Crowdsourcing, Comparative Judgement, Educational Datasets, Human Feedback
Discipline: Computer Science
DOI: 10.1145/3573051.3596198
Citations: 2
-
Authors: AA Arechar, DG Rand
Year: 2021
Published in: Behavior research methods, 2021 - Springer
Institution: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
Research Area: Online Labor Markets, Amazon Mechanical Turk (MTurk), Social Science Research during COVID-19
Discipline: Behavioral Research
Citations: 154
-
Authors: N Gupta, L Rigotti, A Wilson
Year: 2021
Published in: arXiv preprint arXiv:2107.05064, 2021 - arxiv.org
Institution: University of Cambridge, University of Verona, University of Oxford, University of Pittsburgh
Research Area: Experimental Design, Research Methodology, Inferential Statistics
Discipline: Social Science Research Methods
Citations: 104
-
Authors: A Ladak, J Harris, JR Anthis
Year: 2024
Published in: Proceedings of the 2024 CHI Conference ..., 2024 - dl.acm.org
Institution: University of Cambridge, University of Bath, University of Edinburgh
Research Area: Moral consideration of AI, Conjoint Experiment, Human-Computer Interaction (HCI), Psychology
Discipline: Human-Computer Interaction (HCI)
DOI: 10.1145/3613904.3642403
Citations: 16