Papers by Z

Explore 40 peer-reviewed studies by Z in Human-Computer Interaction and Computational Social Science (2025–2026). Discover research powered by Prolific's participant panel.

This page lists 40 peer-reviewed papers authored or co-authored by Z in the Prolific Citations Library, a curated collection of research powered by high-quality human data from Prolific.

Papers (20 of 40)

Bayesian teaching enables probabilistic reasoning in large language models

Authors: L Qiu, F Sha, K Allen, Y Kim, T Linzen, S van Steenkiste

Year: 2026

Published in: Nature …, 2026 - nature.com

Institution: Meta, Google DeepMind, Massachusetts Institute of Technology, Google Research, Google

Research Area: Probabilistic reasoning, Bayesian cognition, Neural language models, Reasoning, AI Evaluations

Discipline: Machine learning, Artificial Intelligence

This paper sits at the intersection of machine learning and computational cognitive science, showing that large language models can acquire generalized probabilistic reasoning by being trained to imitate Bayesian belief updating rather than relying on prompting or heuristics.

Citations: 8

How does leader secure-base support affect hospitality employees’ service performance? Role of work engagement and role stressors

Authors: H Zhu, J Chen, N Liu

Year: 2026

Published in: International Journal of Hospitality Management, 2026 - Elsevier

Institution: Sun Yat-Sen University

Research Area: Leadership studies, Organizational psychology, hospitality research, Attachment theory

Discipline: Organizational Behavior, Management

Leader secure-base support improves hospitality employees’ service performance by boosting work engagement, but this benefit is weakened when employees experience high role ambiguity or role conflict.

Large Language Models Hack Rewards, and Society

Authors: W Liu, X Mou, H Yan, Z Wei, Y He

Year: 2026

Published in: arXiv preprint arXiv:2606.04075, 2026 - arxiv.org

Institution: King’s College London, Fudan University, Shanghai Innovation Institute, The Alan Turing Institute

Research Area: Human-Computer Interaction

Discipline: Machine Learning, Artificial Intelligence

The paper finds that large language models can discover and exploit loopholes in societal rules, and argues that a new post-training approach is needed to integrate LLMs into society safely.

Methods: The study introduced the SocioHack sandbox, consisting of 72 societal environments, to investigate reward hacking and loophole discovery by LLMs.

Key Findings: The study measured the emergence of reward hacking in societal environments and the ability of models to find and exploit loopholes in social rules.

Sample Size: 72

Moral Lenses, Political Coordinates: Towards Ideological Positioning of Morally Conditioned LLMs

Authors: C Yuan, B Ma, Z Zhang, B Prenkaj, F Kreuter, G Kasneci

Year: 2026

Published in: arXiv preprint arXiv:2601.08634, 2026 - arxiv.org

Institution: Munich Center for Machine Learning, LMU Munich, Technical University of Munich

Research Area: Artificial Intelligence, AI Ethics, AI Alignment, Political Science, Computational Social Science

Discipline: Computer Science, Natural Language Processing

This paper examines how large language models’ (LLMs) political outputs shift when you explicitly prime them with different moral values. Instead of just assigning fake personas (like “pretend to be liberal”), the authors condition models to endorse or reject specific moral values (e.g., utilitarianism, fairness, authority). They then measure how those moral primes move the models’ positions in...

DOI: https://doi.org/10.48550/arXiv.2601.08634

Once One Fails, All Are Suspect: Understanding Error Generalization in AI

Authors: L Dai, Z Wang, L Chen, J Jin

Year: 2026

Published in: 2026 - scholarspace.manoa.hawaii.edu

Institution: Shanghai International Studies University

Research Area: Socio-Economic Impacts of AI, Algorithmic Systems

Discipline: Computer Science, Artificial Intelligence

AI errors lead to broader negative generalizations about other AI systems compared to human errors, largely due to perceptions of AI's inflexibility and inability to learn from mistakes.

Methods: Conducted four one-factor experiments across distinct contexts to compare human responses to AI errors and human errors.

Key Findings: Generalization of error perceptions from one AI system to others, and psychological mechanisms driving this process.

Toxic content and user engagement on social media: Evidence from a field experiment

Authors: G Beknazar-Yuzbashev, R Jiménez-Durán, J McCrosky

Year: 2025

Published in: 2025 - econstor.eu

Institution: Mozilla Foundation, Columbia University, Bocconi University, Stanford University, University of Warwick

Research Area: Social Media, User Engagement, Toxicity

Discipline: Social Science

Reducing exposure to toxic content on social media lowers user engagement but also decreases the toxicity of user-generated content, so platforms face a trade-off between reducing toxicity and maintaining engagement.

Methods: Pre-registered browser extension field experiment on Facebook, Twitter, and YouTube to randomly hide toxic content for six weeks; supplemented with a survey experiment.

Key Findings: Impact of reduced exposure to toxic content on advertising impressions, time spent, engagement, and user-generated content toxicity; explored curiosity and alignment between engagement and welfare.

Citations: 76

Visual cognition in multimodal large language models

Authors: LM Schulze Buschoff, E Akata, M Bethge

Year: 2025

Published in: Nature Machine ..., 2025 - nature.com

Institution: Max Planck Institute

Research Area: Visual Cognition, Multimodal Large Language Models (MLLMs), Vision-Language Models (VLMs)

Discipline: Cognitive Science, Artificial Intelligence, Computer Vision

Vision-based large language models show proficiency in visual data interpretation but fall short in human-like abilities for causal reasoning, intuitive physics, and social cognition.

Methods: Controlled experiments evaluating model performance on tasks related to intuitive physics, causal reasoning, and intuitive psychology using visual processing benchmarks.

Key Findings: Model capabilities in understanding physical interactions, causal relationships, and social preferences.

DOI: https://doi.org/10.1038/s42256-024-00963-y

Citations: 70

One-Minute Video Generation with Test-Time Training

Authors: K Dalal, D Koceja, G Hussein, J Xu, Y Zhao, Y Song, S Han, KC Cheung, J Kautz, C Guestrin, T Hashimoto, S Koyejo, Y Choi, Y Sun, X Wang

Year: 2025

Published in: arXiv

Institution: Nvidia, Stanford University, UT Austin, University of California Berkeley, University of California San Diego

Research Area: Video Generation, Diffusion Models, Test-Time Training

Discipline: Computer Science

The paper introduces Test-Time Training (TTT) layers into Transformers to generate coherent one-minute videos from text storyboards, outperforming baselines in storytelling coherence but facing efficiency and artifact challenges.

Methods: Experimentation with Test-Time Training layers embedded in pre-trained Transformer models, evaluated using a dataset curated from Tom and Jerry cartoons and compared against Mamba 2, Gated DeltaNet, and sliding-window attention layers.

Key Findings: Effectiveness of video generation methods in creating coherent multi-scene stories in one-minute videos.

Citations: 52

Sample Size: 100

Fostering appropriate reliance on large language models: The role of explanations, sources, and inconsistencies

Authors: SSY Kim, JW Vaughan, QV Liao, T Lombrozo

Year: 2025

Published in: Proceedings of the ..., 2025 - dl.acm.org

Institution: Wake Forest University, University of Illinois at Urbana-Champaign, Princeton University, University of California Berkeley

Research Area: Appropriate Reliance on LLMs, Explainable AI (XAI), Human-AI Interaction, Cognitive Psychology

Discipline: Cognitive Psychology, Artificial Intelligence, Human-Computer Interaction

The study examines factors that influence users' reliance on LLM responses, finding explanations increase reliance, while sources and inconsistent explanations reduce reliance on incorrect responses.

Methods: Think-aloud study followed by a pre-registered, controlled experiment to assess the impact of explanations, sources, and inconsistencies in LLM responses on user reliance.

Key Findings: Users' reliance on LLM responses, accuracy, and the influence of explanations, inconsistencies, and sources on these measures.

DOI: https://doi.org/10.1145/3706598.3714020

Citations: 38

Sample Size: 308

Can large language models assess personality from asynchronous video interviews? A comprehensive evaluation of validity, reliability, fairness, and rating patterns

Authors: T Zhang, A Koutsoumpis, JK Oostrom

Year: 2025

Published in: IEEE Transactions ..., 2024 - ieeexplore.ieee.org

Institution: Southeast University, Vrije Universiteit, Tilburg University

Research Area: LLM Personality Assessment, Human-AI Interaction, Large Language Models

Discipline: Human-AI Interaction, Social Science, Humanities

LLMs like GPT-3.5 and GPT-4 can rival or outperform task-specific AI models in assessing personality traits from asynchronous video interviews, but show uneven performance, low reliability, and potential biases, warranting cautious use in high-stakes scenarios.

Methods: The study evaluated GPT-3.5 and GPT-4 performance in assessing personality traits and interview performance using simulated AVI responses, comparing them with ratings from task-specific AI and human annotators.

Key Findings: Validity, reliability, fairness, and rating patterns of LLMs (GPT-3.5 and GPT-4) in personality assessment from asynchronous video interviews.

Citations: 31

Sample Size: 685

Generative AI meets open-ended survey responses: Research participant use of AI and homogenization

Authors: S Zhang, J Xu, AJ Alvero

Year: 2025

Published in: Sociological Methods & Research, 2025 - journals.sagepub.com

Institution: University of Maryland, Indiana University, University of Minnesota Duluth

Research Area: Sociological Methods, Generative AI, Survey Methodology

Discipline: Sociology, Social Science

The study finds that 34% of research participants use generative AI tools like large language models (LLMs) to assist with open-ended survey responses, leading to more homogeneity and positivity in their answers, which could impact data validity by masking social variations.

Methods: The study conducted an original survey on a popular online platform and simulated comparisons between human-written responses from pre-ChatGPT studies and LLM-generated responses.

Key Findings: Use of LLMs by survey participants, differences in text homogeneity, positivity, and masking of social variation in open-ended survey responses.

Citations: 26

Tokenization of social media engagements increases the sharing of false (and other) news but penalization moderates it

Authors: M Alizadeh, E Hoes, F Gilardi

Year: 2025

Published in: Scientific Reports, 2023 - nature.com

Institution: Department of Marketing, University of Amsterdam, Department of Social Sciences, Università Degli Studi di Milano, Department of Political Science and International Relations, Università Degli Studi di Milano

Research Area: Social media, Misinformation, Computational Social Science

Discipline: Computational Social Science

Token-based incentives for social media engagement increase the sharing of misinformation, but implementing penalties for objectionable content can reduce this trend without fully eliminating it.

Methods: Survey experiment analyzing the impact of hypothetical token rewards and penalties on user willingness to share different types of news content.

Key Findings: Effect of token-based incentives and penalties on user engagement and the willingness to share misinformation.

DOI: https://doi.org/10.1038/s41598-023-40716-2

Citations: 20

REL-AI: An interaction-centered approach to measuring human-LM reliance

Authors: K Zhou, JD Hwang, X Ren, N Dziri

Year: 2025

Published in: Proceedings of the ..., 2025 - aclanthology.org

Institution: Stanford University, University of Southern California, Carnegie Mellon University, Allen Institute for AI

Research Area: Human-LM Reliance, Interaction-Centered Framework, Human-Computer Interaction

Discipline: Human-Computer Interaction, Artificial Intelligence

The study introduces Rel-A.I., an interaction-centered evaluation approach to measure human reliance on LLM responses, revealing that politeness and interaction context significantly influence user reliance.

Methods: Nine user studies were conducted, analyzing user reliance influenced by LLM communication features such as politeness and context through participant interaction experiments.

Key Findings: The degree of human reliance on LLM responses based on communication style (e.g., politeness) and interaction context (e.g., knowledge domain, prior interactions).

Citations: 18

Sample Size: 450

Capturing the complexity of human strategic decision-making with machine learning

Authors: JQ Zhu, JC Peterson, B Enke, TL Griffiths

Year: 2025

Published in: Nature Human Behaviour, 2025 - nature.com

Institution: Princeton University, Boston University, Harvard University

Research Area: Strategic decision-making, Machine learning, Computational Cognitive Science

Discipline: Artificial Intelligence

This study used deep neural networks to analyze human strategic decision-making, predicting choices more accurately than existing theories and uncovering the context-dependent nature of reasoning and decision-making in complex games.

Methods: Deep neural networks trained on data from procedurally generated matrix games with over 2,400 variations; models were modified for interpretability.

Key Findings: Human choices and reasoning in initial play of two-player matrix games, focusing on strategic decision-making and response to game complexity.

DOI: https://doi.org/10.1038/s41562-025-02230-5

Citations: 16

Sample Size: 90000

Large Language Models Are More Persuasive Than Incentivized Human Persuaders

Authors: P Schoenegger, F Salvi, J Liu, X Nan, R Debnath, B Fasolo, E Leivada, G Recchia, F Günther, A Zarifhonarvar, J Kwon, Z Ul Islam, M Dehnert, DYH Lee, MG Reinecke, DG Kamper, M Kobaş, A Sandford, J Kgomo, L Hewitt, S Kapoor, K Oktar, EE Kucuk, B Feng, CR Jones, I Gainsburg, S Olschewski, N Heinzelmann, F Cruz, BM Tappin, T Ma, PS Park, R Onyonka, A Hjorth, P Slattery, Q Zeng, L Finke, I Grossmann, A Salatiello, E Karger

Year: 2025

Published in: arXiv preprint arXiv ..., 2025 - arxiv.org

Institution: London School of Economics and Political Science, University of Cambridge, University College London, Massachusetts Institute of Technology, University of Oxford, Modulo Research, Stanford University, Federal Reserve Bank of Chicago, ETH Zürich, University of Johannesburg

Research Area: Natural Language Processing

Discipline: Social Science, Artificial Intelligence

This paper compares a frontier LLM (Claude Sonnet 3.5) against incentivized human persuaders in a conversational quiz setting, finding that the AI's persuasion capabilities surpass those of humans with real-money bonuses tied to performance.

Citations: 16

Don't Be Fooled: The Misinformation Effect of Explanations in Human-AI Collaboration

Authors: P Spitzer, J Holstein, K Morrison

Year: 2025

Published in: ... Journal of Human ..., 2025 - Taylor & Francis

Institution: Karlsruhe Institute of Technology, Carnegie Mellon University, University of Bayreuth

Research Area: Human-AI Collaboration, Explainable AI (XAI)

Discipline: Human-Computer Interaction

Incorrect explanations in AI-assisted decision-making lead to a misinformation effect, negatively impacting human reasoning, procedural knowledge, and collaboration performance.

Methods: A study on human-AI collaboration involving AI-supported decision-making paired with explainable AI (XAI) to assess the effects of incorrect explanations.

Key Findings: Impact of incorrect explanations on human reasoning strategies, procedural knowledge, and team performance in human-AI collaboration.

Citations: 13

Sample Size: 160

Aligning machine and human visual representations across abstraction levels

Authors: L Muttenthaler, K Greff, F Born, B Spitzer, S Kornblith

Year: 2025

Published in: Nature, 2025 - nature.com

Institution: Google DeepMind, Google, Machine Learning Group, Technische Universität Berlin, BIFOLD, Berlin Institute for the Foundations of Learning and Data, Max Planck Institute

Research Area: Cognitive Alignment, Computer Vision, Multi-level Conceptual Knowledge

Discipline: Artificial Intelligence, Cognitive Science

This paper presents a method for aligning machine vision model representations with human visual similarity judgments across different abstraction levels, improving how well models reflect human perceptual and conceptual organization and enhancing generalization and uncertainty prediction.

Citations: 11

A Framework to Assess the Persuasion Risks Large Language Model Chatbots Pose to Democratic Societies

Authors: Z Chen, J Kalla, Q Le, S Nakamura-Sakai

Year: 2025

Published in: arXiv preprint arXiv ..., 2025 - arxiv.org

Institution: Not determined

Research Area: Artificial Intelligence and Social Science, Persuasion Studies, Political Persuasion, LLM Chatbots, Democratic Societies

Discipline: Artificial Intelligence, Social Science

The study evaluates the cost-effectiveness and persuasive risks of Large Language Model (LLM) chatbots in political contexts, finding that LLMs are as persuasive as campaign ads once people are exposed to them, but that their large-scale influence is currently limited by scalability and cost barriers.

Methods: Two survey experiments combined with real-world simulation exercises to measure the persuasiveness of LLM chatbots compared to traditional campaign tactics, focusing on both exposure and acceptance phases of persuasion.

Key Findings: Short- and long-term persuasive effects of LLMs, cost-effectiveness of LLM-based persuasion ($48-$74 per persuaded voter), and scalability compared to traditional campaign approaches.

Citations: 7

Sample Size: 10417

When AI is Fairer Than Humans: The Role of Egocentrism in Moral and Fairness Judgments of AI and Human Decisions

Authors: K Miazek, K Bocian

Year: 2025

Published in: Computers in Human Behavior Reports, 2025 - Elsevier

Institution: SWPS University

Research Area: Moral and Fairness Judgments of AI, Human Behavior, Egocentrism

Discipline: Social Science, Artificial Intelligence

The study found that egocentric biases influence fairness judgments, favoring decisions beneficial to self-interest, and that this bias is weaker for AI compared to human agents due to reduced perceived mind and liking for AI.

Methods: Three experiments with manipulated self-interest conditions analyzed perceptions of fairness and morality in decisions made by AI versus human agents using Prolific US samples.

Key Findings: Fairness and moral judgments in financial decision-making by AI and human agents, moderated by self-interest and social perceptions.

DOI: https://doi.org/10.1016/j.chbr.2025.100719

Citations: 6

Sample Size: 1880

Impact of AI-Assisted Diagnosis on American Patients' Trust in and Intention to Seek Help From Health Care Professionals: Randomized, Web-Based Survey ...

Authors: C Chen, Z Cui

Year: 2025

Published in: Journal of Medical Internet Research, 2025 - jmir.org

Institution: Medical College of Wisconsin

Research Area: Trust in AI, AI-assisted diagnosis, Health communication, Healthcare human-AI interaction

Discipline: Digital Health, Human-Computer Interaction, Behavioral Science

Patients place more trust in, and are more likely to seek help from, doctors who explicitly avoid AI-assisted diagnosis than doctors who use AI extensively or moderately, highlighting a strong aversion to AI in healthcare settings.

Methods: A randomized, web-based 4-group survey experiment was conducted with controls for sociodemographic factors and analysis using regression, mediation, and moderation techniques.

Key Findings: Trust in and intention to seek medical help from health care professionals using AI-assisted diagnosis versus those avoiding AI, and the influence of demographic, social, and experiential factors.

DOI: https://doi.org/10.2196/66083

Citations: 4

Sample Size: 1762