Ai: Research Area — Prolific Citations Library

Discover 193 peer-reviewed studies in Ai (2025–2026). Explore research findings powered by Prolific's diverse participant panel.

This page lists 193 peer-reviewed papers in the research area of Ai in the Prolific Citations Library, a curated collection of research powered by high-quality human data from Prolific.

Papers (20 of 193)

Bayesian teaching enables probabilistic reasoning in large language models

Authors: L Qiu, F Sha, K Allen, Y Kim, T Linzen, S van Steenkiste

Year: 2026

Published in: Nature …, 2026 - nature.com

Institution: Meta, Google DeepMind, Massachusetts Institute of Technology, Google Research, Google

Research Area: Probabilistic reasoning, Bayesian cognition, Neural language models, Reasoning, AI Evaluations

Discipline: Machine learning, Artificial intelligence

This paper sits at the intersection of machine learning and computational cognitive science, showing that large language models can acquire generalized probabilistic reasoning by being trained to imitate Bayesian belief updating rather than relying on prompting or heuristics.

Citations: 8
Moral Lenses, Political Coordinates: Towards Ideological Positioning of Morally Conditioned LLMs

Authors: C Yuan, B Ma, Z Zhang, B Prenkaj, F Kreuter, G Kasneci

Year: 2026

Published in: arXiv preprint arXiv:2601.08634, 2026•arxiv.org

Institution: Munich Center for Machine Learning, LMU Munich, Technical University of Munich

Research Area: Artificial Intelligence, AI Ethics, AI Alignment, Political Science, Computational Social Science

Discipline: Computer Science, Natural Language Processing (NLP)

This paper examines how large language models’ (LLMs) political outputs shift when you explicitly prime them with different moral values. Instead of just assigning fake personas (like “pretend to be liberal”), the authors condition models to endorse or reject specific moral values (e.g., utilitarianism, fairness, authority). They then measure how those moral primes move the models’ positions in...

DOI: https://doi.org/10.48550/arXiv.2601.08634
Once One Fails, All Are Suspect: Understanding Error Generalization in AI

Authors: L Dai, Z Wang, L Chen, J Jin

Year: 2026

Published in: 2026•scholarspace.manoa.hawaii.edu

Institution: Shanghai International Studies University

Research Area: Socio-Economic Impacts of AI, Algorithmic Systems

Discipline: Computer Science, Artificial Intelligence

AI errors lead to broader negative generalizations about other AI systems compared to human errors, largely due to perceptions of AI's inflexibility and inability to learn from mistakes.

Methods: Conducted four one-factor experiments across distinct contexts to compare human responses to AI errors and human errors.

Key Findings: Generalization of error perceptions from one AI system to others, and psychological mechanisms driving this process.
Self-image matters: Examining individual differences in resistance to loss framing messages

Authors: J He, C Calluso, C Donato, R Thouvarecq

Year: 2026

Published in: … - Journal of Retailing and …, 2026 - Elsevier

Institution: Luiss University, Roma Tre University, Univ Rouen Normandie, Le Mans Université

Research Area: Message framing, Psychological reactance, Self-image traits

Discipline: Consumer behavior

This paper looks at why some people get annoyed / push back (“psychological reactance”) when online grocery sites show healthy-eating PSAs, especially when the PSA is framed as a warning (“If you don’t eat well, you’ll suffer”) vs a benefit (“If you eat well, you’ll gain”).
The Artificial Intelligence Disclosure Penalty: Humans Persistently Devalue AI-Generated Creative Writing

Authors: M Raj, JM Berg, R Seamans

Year: 2026

Published in: Journal of Experimental Psychology …, 2026 - psycnet.apa.org

Institution: New York University, University of Michigan, Wharton

Research Area: Disclosure psychology, Biases in human–machine evaluation, AI Biases

Discipline: Experimental psychology

This paper sits at the intersection of experimental psychology, social cognition, and consumer judgment, examining how AI disclosure triggers persistent authenticity-based bias against creative work, revealing a robust form of algorithmic aversion in symbolic and expressive domains.

DOI: https://doi.org/10.1037/xge0001889
Unpacking Human Preference for LLMs: Demographically Aware Evaluation with the HUMAINE Framework

Authors: N Petrova, A Gordon, E Blindow

Year: 2026

Published in: Open review

Institution: Prolific

Research Area: Human-centered AI evaluation, Bayesian statistics, Responsible AI, AI alignment, LLM Evaluation

Discipline: Machine Learning, Artificial Intelligence

The study introduces HUMAINE, a multidimensional evaluation framework for LLMs, revealing demographic-specific preference variations and ranking google/gemini-2.5-pro as the top-performing model with a posterior probability of 95.6%.

Methods: Multi-turn naturalistic conversations analyzed using a hierarchical Bayesian Bradley-Terry-Davidson model with post-stratification to census data, stratified across 22 demographic groups.

Key Findings: Performance of 28 LLMs across five human-centric dimensions, accounting for demographic-specific preferences.

Sample Size: 23404
The placebo effect of artificial intelligence in Human-Computer Interaction (HCI)

Authors: T Kosch, R Welsch, L Chuang, A Schmidt

Year: 2025

Published in: ACM Transactions on ..., 2023 - dl.acm.org

Institution: Aalto University

Research Area: User Expectations, HCI Research Bias, Artificial Intelligence, AI Bias

Discipline: Human-Computer Interaction (HCI)

The belief in receiving adaptive AI support positively impacts user performance, demonstrating a placebo effect in Human-Computer Interaction.

Methods: Two experiments where participants completed word puzzles under conditions with or without supposed AI support; in reality, no AI assistance was provided.

Key Findings: Impact of perceived AI support on user expectations and task performance.

DOI: https://doi.org/10.1145/3529225

Citations: 149

Sample Size: 469
What large language models know and what people think they know

Authors: M Steyvers, H Tejeda, A Kumar, C Belem

Year: 2025

Published in: Nature Machine ..., 2025 - nature.com

Institution: University of California Irvine

Research Area: Computational Linguistics, Computational Social Science, AI Ethics, Trust in AI

Discipline: Computational Social Science

LLMs often lead to user overestimation of response accuracy, especially with longer explanations; adjusting explanation styles to align with model confidence improves calibration and discrimination gaps, enhancing trust in AI-assisted decision making.

Methods: Conducted experiments using multiple-choice and short-answer questions to study user confidence versus model-stated confidence; varied explanation length and alignment with model internal confidence.

Key Findings: Calibration gap (human vs. model confidence), discrimination gap (ability to distinguish correct vs. incorrect answers), and effects of explanation style and length on user trust.

Citations: 100
Human detection of political speech deepfakes across transcripts, audio, and video

Authors: M Groh, A Sankaranarayanan, N Singh, DY Kim

Year: 2025

Published in: Nature ..., 2024 - nature.com

Institution: Northwestern University, Massachusetts Institute of Technology

Research Area: Deepfakes, Media Forensics, Human Perception of AI-Generated Content, Political Communication

Discipline: Computational Social Science

Humans are better at detecting deepfake political speeches using audio-visual cues than relying on text alone; state-of-the-art text-to-speech audio makes deepfakes harder to discern.

Methods: Five pre-registered randomized experiments with varied base rates of misinformation, audio sources, question framings, and media modalities were conducted.

Key Findings: Human accuracy in discerning real political speeches from deepfakes across media formats and contextual variables.

DOI: https://doi.org/10.1038/s41467-024-51998-z

Citations: 63

Sample Size: 2215
Dimensions of diversity in human perceptions of algorithmic fairness

Authors: N Grgić-Hlača, G Lima, A Weller

Year: 2025

Published in: Proceedings of the 2nd ..., 2022 - dl.acm.org

Institution: Max Planck Institute, École Polytechnique Fédérale de Lausanne, University of Cambridge, The Alan Turing Institute

Research Area: Algorithmic Fairness, Human Perception, Diversity in AI Decision-Making

Discipline: Social Science, Artificial Intelligence

This study examines how sociodemographic factors and personal experience influence perceptions of fairness in algorithmic decision-making, particularly in bail decisions, highlighting the importance of diverse perspectives in regulatory oversight.

Methods: Explored perceptions of procedural fairness using surveys to assess the influence of demographics and personal experiences.

Key Findings: Impact of demographics (age, education, gender, race, political views) and personal experience on perceptions of fairness of algorithmic feature use in bail decisions.

DOI: 10.1145/3551624.3555306

Citations: 62
One-Minute Video Generation with Test-Time Training

Authors: K Dalal, D Koceja, G Hussein, J Xu, Y Zhao, Y Song, S Han, KC Cheung, J Kautz, C Guestrin, T Hashimoto, S Koyejo, Y Choi, Y Sun, X Wang

Year: 2025

Published in: ArXiv

Institution: Nvidia, Stanford University, UT Austin, University of California Berkeley, University of California San Diego

Research Area: Video Generation, Diffusion Models, Test-Time Training

Discipline: Computer Science

The paper introduces Test-Time Training (TTT) layers into Transformers to generate coherent one-minute videos from text storyboards, outperforming baselines in storytelling coherence but facing efficiency and artifact challenges.

Methods: Experimentation with Test-Time Training layers embedded in pre-trained Transformer models, evaluated using a dataset curated from Tom and Jerry cartoons and compared against Mamba 2, Gated DeltaNet, and sliding-window attention layers.

Key Findings: Effectiveness of video generation methods in creating coherent multi-scene stories in one-minute videos.

Citations: 52

Sample Size: 100
Fostering appropriate reliance on large language models: The role of explanations, sources, and inconsistencies

Authors: SSY Kim, JW Vaughan, QV Liao, T Lombrozo

Year: 2025

Published in: Proceedings of the ..., 2025 - dl.acm.org

Institution: Wake Forest University, University of Illinois at Urbana-Champaign, Princeton University, University of California Berkeley

Research Area: Appropriate Reliance on LLMs, Explainable AI, Human-AI Interaction, Cognitive Psychology

Discipline: Cognitive Psychology, Artificial Intelligence, Human-Computer Interaction (HCI)

The study examines factors that influence users' reliance on LLM responses, finding explanations increase reliance, while sources and inconsistent explanations reduce reliance on incorrect responses.

Methods: Think-aloud study followed by a pre-registered, controlled experiment to assess the impact of explanations, sources, and inconsistencies in LLM responses on user reliance.

Key Findings: Users' reliance on LLM responses, accuracy, and the influence of explanations, inconsistencies, and sources on these measures.

DOI: https://doi.org/10.1145/3706598.3714020

Citations: 38

Sample Size: 308
When AI-Based Agents Are Proactive: Implications for Competence and System Satisfaction in Human-AI Collaboration

Authors: C Diebel, M Goutier, M Adam, A Benlian

Year: 2025

Published in: Business & Information Systems ..., 2025 - Springer

Institution: Technical University of Darmstadt, University of Goettingen

Research Area: Human-AI Collaboration, System Satisfaction, User Competence

Discipline: Information Systems, Human-Computer Interaction (HCI), Artificial Intelligence

Proactive AI-based agent assistance decreases users' competence-based self-esteem and system satisfaction, especially for users with higher AI knowledge.

Methods: Vignette-based online experiment using self-determination theory as the framework to evaluate user responses to proactive vs. reactive AI assistance.

Key Findings: Impact of proactive vs. reactive AI help on users' competence-based self-esteem and system satisfaction, moderated by users' AI knowledge levels.

DOI: https://doi.org/10.1007/s12599-024-00918-y

Citations: 32
Can large language models assess personality from asynchronous video interviews? A comprehensive evaluation of validity, reliability, fairness, and rating patterns

Authors: T Zhang, A Koutsoumpis, JK Oostrom

Year: 2025

Published in: IEEE Transactions ..., 2024 - ieeexplore.ieee.org

Institution: Southeast University, Vrije Universiteit, Tilburg University

Research Area: LLM Personality Assessment, Human-AI Interaction, LLM

Discipline: Human-AI Interaction, Social Science, Humanities

LLMs like GPT-3.5 and GPT-4 can rival or outperform task-specific AI models in assessing personality traits from asynchronous video interviews, but show uneven performance, low reliability, and potential biases, warranting cautious use in high-stakes scenarios.

Methods: The study evaluated GPT-3.5 and GPT-4 performance in assessing personality traits and interview performance using simulated AVI responses, comparing them with ratings from task-specific AI and human annotators.

Key Findings: Validity, reliability, fairness, and rating patterns of LLMs (GPT-3.5 and GPT-4) in personality assessment from asynchronous video interviews.

Citations: 31

Sample Size: 685
The challenges of providing explanations of AI systems when they do not behave like users expect

Authors: M Riveiro, S Thill

Year: 2025

Published in: Proceedings of the 30th ACM Conference on User ..., 2022 - dl.acm.org

Institution: Linköping University, University of Skövde

Research Area: Explainable AI, Human-Computer Interaction (HCI)

Discipline: Human-Computer Interaction (HCI)

Users prefer factual explanations when AI outputs match expectations and mechanistic explanations when outputs deviate, with preferences influenced by response format (multiple-choice vs free text).

Methods: Participants were presented with scenarios involving an automated text classifier and asked to express their preference for explanations either through multiple-choice or free text responses.

Key Findings: User-desired content of AI explanations based on whether system behaviour aligns or deviates from expectations.

DOI: 10.1145/3503252.3531306

Citations: 30
Generative AI meets open-ended survey responses: Research participant use of ai and homogenization

Authors: S Zhang, J Xu, AJ Alvero

Year: 2025

Published in: Sociological Methods & Research, 2025 - journals.sagepub.com

Institution: University of Maryland, Indiana University, University of Minnesota Duluth

Research Area: Sociological Methods, Generative AI, Survey Methodology

Discipline: Sociology, Social Science

The study finds that 34% of research participants use generative AI tools like large language models (LLMs) to assist with open-ended survey responses, leading to more homogeneity and positivity in their answers, which could impact data validity by masking social variations.

Methods: The study conducted an original survey on a popular online platform and simulated comparisons between human-written responses from pre-ChatGPT studies and LLM-generated responses.

Key Findings: Use of LLMs by survey participants, differences in text homogeneity, positivity, and masking of social variation in open-ended survey responses.

Citations: 26
Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models

Authors: L Ibrahim, C Akbulut, R Elasmar, C Rastogi, M Kahng, MR Morris, KR McKee, V Rieser, M Shanahan, L Weidinger

Year: 2025

Published in: arXiv preprint arXiv:2502.07077, 2025•arxiv.org

Institution: Google DeepMind, Google, University of Oxford

Research Area: Multimodal conversational AI, conversational AI, Evaluation methodology, benchmarking

Discipline: Computer Science, Natural Language Processing (NLP), Human–Computer Interaction (HCI)

The paper evaluates anthropomorphic behaviors in SOTA LLMs through a multi-turn methodology, showing that such behaviors, including empathy and relationship-building, predominantly emerge after multiple interactions and influence user perceptions.

Methods: Multi-turn evaluation of 14 anthropomorphic behaviors using simulations of user interactions, validated by a large-scale human subject study.

Key Findings: Anthropomorphic behaviors in large language models, including relationship-building and pronoun usage, and their perception by users.

Citations: 26

Sample Size: 1101
To rely or not to rely? evaluating interventions for appropriate reliance on large language models

Authors: JY Bo, S Wan, A Anderson

Year: 2025

Published in: Proceedings of the 2025 CHI Conference ..., 2025 - dl.acm.org

Institution: University of Toronto

Research Area: Appropriate reliance on LLM, Human-Computer Interaction (HCI), AI-assisted decision making.

Discipline: Human-Computer Interaction (HCI)

This paper explores the latest advancements and key trends in the field of Human-Computer Interaction (HCI), focusing on novel interfaces and user experience paradigms.

Citations: 25
How do people react to political bias in generative Artificial Intelligence?

Authors: U Messer

Year: 2025

Published in: Computers in Human Behavior: Artificial Humans, 2025 - Elsevier

Institution: Universität der Bundeswehr München

Research Area: Political Bias in Generative AI, Human-AI Interaction, Affective Computing, AI Bias

Discipline: Computer Science, Human-AI Interaction

People's acceptance and reliance on Generative AI (GAI) increase when they perceive alignment between their political orientation and the bias of GAI-generated content, leading to expanded trust in sensitive applications.

Methods: Three experiments analyzing behavioral reactions to politically biased content generated by GAI, including the impact of perceived alignment on acceptance and trust.

Key Findings: Participants' acceptance, reliance, and trust in GAI based on perceived alignment between political bias of GAI-generated content and their own political beliefs.

DOI: https://doi.org/10.1016/j.chbah.2024.100108

Citations: 24

Sample Size: 513
People Overtrust AI-Generated Medical Advice despite Low Accuracy

Authors: S Shekar, P Pataranutaporn, C Sarabu, GA Cecchi

Year: 2025

Published in: NEJM AI, 2025 - ai.nejm.org

Institution: MIT Media Lab, IBM Research, Stanford University, Massachusetts Institute of Technology

Research Area: AI Ethics, Healthcare, Patient Trust, Medical Misinformation

Discipline: Artificial Intelligence, Human-Computer Interaction (HCI), AI Ethics

This paper discusses a study by MIT researchers detailing patient trust in AI-generated medical advice, even when that advice is incorrect, raising concerns about misinformation in healthcare.

Citations: 19