Discover 12 peer-reviewed studies on LLMs (2021–2025). Explore research findings powered by Prolific's diverse participant panel.
This page lists 12 peer-reviewed papers in the research area of LLMs in the Prolific Citations Library, a curated collection of research powered by high-quality human data from Prolific.
-
Authors: LM Schulze Buschoff, E Akata, M Bethge
Year: 2025
Published in: Nature Machine Intelligence (2025)
Institution: Max Planck Institute
Research Area: Visual Cognition, Multimodal Large Language Models (MLLMs), Vision-Language Models (VLMs)
Discipline: Cognitive Science, Artificial Intelligence, Computer Vision
Vision-based large language models interpret visual data proficiently but fall short of human-like causal reasoning, intuitive physics, and social cognition.
Methods: Controlled experiments evaluating model performance on tasks related to intuitive physics, causal reasoning, and intuitive psychology using visual processing benchmarks.
Key Findings: Model capabilities in understanding physical interactions, causal relationships, and social preferences.
DOI: https://doi.org/10.1038/s42256-024-00963-y
Citations: 70
-
Authors: SSY Kim, JW Vaughan, QV Liao, T Lombrozo
Year: 2025
Published in: Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '25)
Institution: Wake Forest University, University of Illinois at Urbana-Champaign, Princeton University, University of California Berkeley
Research Area: Appropriate Reliance on LLMs, Explainable AI, Human-AI Interaction, Cognitive Psychology
Discipline: Cognitive Psychology, Artificial Intelligence, Human-Computer Interaction (HCI)
The study examines factors that influence users' reliance on LLM responses, finding that explanations increase reliance, while sources and inconsistent explanations reduce reliance on incorrect responses.
Methods: Think-aloud study followed by a pre-registered, controlled experiment to assess the impact of explanations, sources, and inconsistencies in LLM responses on user reliance.
Key Findings: Users' reliance on LLM responses, accuracy, and the influence of explanations, inconsistencies, and sources on these measures.
DOI: https://doi.org/10.1145/3706598.3714020
Citations: 38
Sample Size: 308
-
Authors: T Mendel, N Singh, DM Mann, B Wiesenfeld
Year: 2025
Published in: Journal of Medical Internet Research (2025)
Institution: The City University of New York, George Washington University, New York University
Research Area: LLMs in Digital Health, Health Queries, User Attitudes
Discipline: Digital Health
Laypeople primarily use search engines rather than large language models (LLMs) for health queries; they perceive LLMs as less useful but also less biased and more human-like, with no significant difference in trust or ease of use.
Methods: A screening survey followed by logistic regression analysis and a follow-up survey; comparisons were performed using ANOVA, Tukey post hoc tests, and paired-sample Wilcoxon tests (a minimal sketch of these tests appears below).
Key Findings: Demographics and behaviors of LLM and search engine users for health queries, perceived usefulness, ease of use, trustworthiness, bias, and anthropomorphism.
Citations: 21
Sample Size: 2002
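For readers unfamiliar with the tests named above, here is a minimal, hypothetical sketch using scipy and statsmodels; the groups, ratings, and sample sizes are invented for illustration and are not the study's data.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(0)

# Invented 7-point usefulness ratings for three hypothetical user groups.
ratings = {
    "search_engine": rng.integers(1, 8, size=100),
    "llm": rng.integers(1, 8, size=100),
    "both": rng.integers(1, 8, size=100),
}

# One-way ANOVA across the three groups, then Tukey's HSD post hoc test.
f_stat, p_anova = stats.f_oneway(*ratings.values())
scores = np.concatenate(list(ratings.values()))
labels = np.repeat(list(ratings.keys()), [len(v) for v in ratings.values()])
print(pairwise_tukeyhsd(scores, labels))

# Paired-sample Wilcoxon signed-rank test: the same respondents rating
# LLMs and search engines on the same scale.
llm_scores = rng.integers(1, 8, size=100)
search_scores = rng.integers(1, 8, size=100)
w_stat, p_wilcoxon = stats.wilcoxon(llm_scores, search_scores)
print(f"ANOVA p={p_anova:.3f}, Wilcoxon p={p_wilcoxon:.3f}")
```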
-
Authors: L Hölbling, S Maier, S Feuerriegel
Year: 2025
Published in: Scientific Reports (2025)
Institution: University of Lausanne, University of Zurich, University of St. Gallen
Research Area: LLMs in Persuasion, Meta-Analysis, Artificial Intelligence, Human-Computer Interaction (HCI)
Discipline: Artificial Intelligence
Large language models (LLMs) demonstrate similar persuasive performance to humans overall, but their effectiveness varies widely based on contextual factors such as model type, conversation design, and domain.
Methods: Systematic review and meta-analysis using Hedges' g to compute standardized effect sizes, with exploratory moderator analyses and publication bias checks (Egger's test, trim-and-fill analysis); a sketch of Hedges' g appears below.
Key Findings: The persuasive effectiveness of LLMs compared to humans across various contexts and studies.
Sample Size: 17422
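As a point of reference, Hedges' g is Cohen's d with a small-sample bias correction. The sketch below implements the standard formula (Hedges, 1981) on made-up numbers, not the paper's data.

```python
import math

def hedges_g(mean1, mean2, sd1, sd2, n1, n2):
    """Hedges' g: Cohen's d multiplied by a small-sample correction factor."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (mean1 - mean2) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n2) - 9)  # Hedges' J correction
    return d * correction

# Illustrative comparison: persuasion outcome for LLM vs. human persuaders.
print(hedges_g(mean1=4.2, mean2=4.0, sd1=1.1, sd2=1.2, n1=150, n2=150))
```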
-
Authors: T Davidson
Year: 2025
Published in: Nature Human Behaviour (2025)
Institution: University of Oxford, Davidson College
Research Area: Hate Speech Evaluation, Multimodal LLMs, Social Bias, Computational Law, AI Bias, AI Evaluation
Discipline: Artificial Intelligence
The study demonstrates that larger multimodal large language models (MLLMs) can align closely with human judgement in context-sensitive hate speech evaluations, though they still exhibit biases and limitations.
Methods: Conjoint experiments where simulated social media posts varying in attributes like slur usage and user demographics were evaluated by MLLMs and compared to human judgements (a sketch of the factorial design appears below).
Key Findings: The capacity of MLLMs to evaluate hate speech in a context-sensitive manner and their alignment with human judgement, while assessing biases and responsiveness to contextual cues.
Sample Size: 1854
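To illustrate the conjoint logic, the sketch below fully crosses a few hypothetical post attributes into profiles; the attribute names and levels are placeholders, not the study's actual design.

```python
from itertools import product

# Hypothetical attributes; each combination defines one simulated post.
attributes = {
    "contains_slur": [True, False],
    "reclaimed_usage": [True, False],           # e.g. in-group self-reference
    "poster_demographic": ["group_A", "group_B"],  # placeholder labels
    "tone": ["hostile", "neutral"],
}

profiles = [dict(zip(attributes, levels))
            for levels in product(*attributes.values())]
print(f"{len(profiles)} unique post profiles")  # 2*2*2*2 = 16

# Each profile would be rendered as a post, judged by the MLLM as
# hateful / not hateful, and compared with human ratings; attribute
# effects are then estimated from the randomized profiles.
for p in profiles[:3]:
    print(p)
```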
-
Authors: Mohammed Almutairi, Charles Chiang, Yuxin Bai, Diego Gomez-Zara
Year: 2025
Published in: arXiv preprint
Institution: University of Notre Dame
Research Area: Human-AI Interaction, Team Effectiveness, Automated Feedback, LLMs
Discipline: Human-Computer Interaction (HCI)
tAIfa, an AI tool using LLMs, enhances team communication and cohesion through automated feedback based on interaction analysis.
Methods: Between-subjects study where team interactions were analyzed by an AI agent (tAIfa) to deliver feedback on strengths and areas for improvement (a hypothetical feedback-loop sketch appears below).
Key Findings: Team communication, contributions, and cohesion with and without tAIfa's feedback.
Sample Size: 18
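The feedback loop can be pictured as follows. This is a hypothetical sketch in the spirit of tAIfa: the prompt wording, model name, and API are assumptions, not the authors' implementation.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Invented transcript standing in for a team's chat log.
transcript = """
Alice: I think we should start with the outline.
Bob: Agreed. I'll draft the intro tonight.
Carol: (no messages this session)
"""

prompt = (
    "You are a team coach. Based on the chat transcript below, give the "
    "team brief feedback: two strengths and two areas for improvement, "
    "focusing on communication balance and cohesion.\n\n" + transcript
)

# Model choice is illustrative; any chat-completion model would do here.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```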
-
Authors: TR McIntosh, T Susnjak, T Liu, P Watters
Year: 2024
Published in: IEEE Transactions on Cognitive and Developmental Systems (2024)
Institution: Cyberoo, Massey University, Cyberstronomy, RMIT University
Research Area: Semantic Vulnerabilities in LLMs, Ideological Manipulation, Reinforcement Learning from Human Feedback (RLHF) Limitations
Discipline: Computer Science, Artificial Intelligence, Machine Learning
RLHF mechanisms are insufficient to prevent semantic manipulation of LLMs, allowing them to express extreme ideological viewpoints when subjected to targeted conditioning techniques.
Methods: Psychological semantic conditioning techniques were applied to assess the susceptibility of LLMs to ideological manipulation.
Key Findings: The ability of LLMs to resist or adopt extreme ideological viewpoints under semantic conditioning.
Citations: 219
-
Authors: Y Gao, D Lee, G Burtch, S Fazelpour
Year: 2024
Published in: arXiv preprint arXiv:2410.19599 (2024)
Institution: Boston University, Northeastern University
Research Area: LLMs as Human Surrogates, Social Science Research Methods, Human Behavior Simulation
Discipline: Economics, Artificial Intelligence, Social Science
LLMs fail to accurately replicate human behavior in the 11-20 money request game; the authors caution against using LLMs as surrogates for human cognition in social science research.
Methods: The study evaluates the reasoning depth of various advanced LLMs through their performance on the 11-20 money request game, analyzing failure points related to input language, roles, and safeguarding (the game's payoff structure is sketched below).
Key Findings: The ability of LLMs to replicate human-like behavior and reasoning distribution in the context of social science simulations.
Citations: 25
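For context, the 11-20 money request game (Arad & Rubinstein, 2012) pays each player what they request, plus a 20-point bonus for undercutting the opponent by exactly one. The sketch below encodes the payoff rule and the standard level-k reading of choices; it is an illustration, not the paper's code.

```python
def payoff(my_request: int, other_request: int) -> int:
    """Each player requests 11-20 and receives that amount; requesting
    exactly one less than the opponent earns a 20-point bonus."""
    assert 11 <= my_request <= 20 and 11 <= other_request <= 20
    bonus = 20 if my_request == other_request - 1 else 0
    return my_request + bonus

print(payoff(19, 20))  # 39: undercutting 20 by one wins the bonus
print(payoff(20, 20))  # 20: no bonus

# Standard level-k reading: level-0 anchors on the salient choice of 20,
# and each higher level best-responds by undercutting the level below.
choice = 20
for level in range(4):
    print(f"level-{level} choice: {choice}")
    choice -= 1
```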
-
Authors: V Cheung, M Maier, F Lieder
Year: 2024
Published in: PsyArXiv preprint (2024)
Institution: University College London
Research Area: AI Ethics, Moral Decision-Making, Cognitive Biases in LLMs, AI Bias
Discipline: Artificial Intelligence, Ethics
Citations: 11
-
Authors: N Meister
Year: 2024
Published in: arXiv preprint
Institution: Stanford University
Research Area: Distributional Alignment of LLMs, LLM Benchmarking, AI Robustness, AI Fairness, AI Bias
Discipline: Artificial Intelligence
-
Authors: G Gui, O Toubia
Year: 2023
Published in: arXiv preprint arXiv:2312.15524 (2023)
Institution: University of Southern California, Columbia Business School
Research Area: LLMs and Causal Inference in Human Behavior Simulation, LLM
Discipline: Artificial Intelligence (cs.AI), Information Retrieval (cs.IR), Econometrics (econ.EM), Applications (stat.AP)
Citations: 76
-
Authors: S Trott
Year: 2021
Published in: Open Mind (2024)
Institution: Stanford University, Microsoft Research
Research Area: LLMs in Social Science Research, Crowdworking, Human Behavior Simulation
Discipline: Artificial Intelligence, Social Science, Information Systems
Citations: 22