This page lists 26 peer-reviewed papers in the research area of Artificial Intelligence in the Prolific Citations Library, a curated collection of research powered by high-quality human data from Prolific.
-
Authors: C Yuan, B Ma, Z Zhang, B Prenkaj, F Kreuter, G Kasneci
Year: 2026
Published in: arXiv preprint arXiv:2601.08634, 2026
Institution: Munich Center for Machine Learning, LMU Munich, Technical University of Munich
Research Area: Artificial Intelligence, AI Ethics, AI Alignment, Political Science, Computational Social Science
Discipline: Computer Science, Natural Language Processing (NLP)
This paper examines how the political outputs of large language models (LLMs) shift when the models are explicitly primed with different moral values. Instead of merely assigning personas (like “pretend to be liberal”), the authors condition models to endorse or reject specific moral values (e.g., utilitarianism, fairness, authority). They then measure how those moral primes move the models’ positions in...
DOI: https://doi.org/10.48550/arXiv.2601.08634
-
Authors: T Kosch, R Welsch, L Chuang, A Schmidt
Year: 2023
Published in: ACM Transactions on Computer-Human Interaction, 2023
Institution: Aalto University
Research Area: User Expectations, HCI Research Bias, Artificial Intelligence, AI Bias
Discipline: Human-Computer Interaction (HCI)
The belief in receiving adaptive AI support positively impacts user performance, demonstrating a placebo effect in Human-Computer Interaction.
Methods: Two experiments where participants completed word puzzles under conditions with or without supposed AI support; in reality, no AI assistance was provided.
Key Findings: Impact of perceived AI support on user expectations and task performance.
DOI: https://doi.org/10.1145/3529225
Citations: 149
Sample Size: 469
-
Authors: MM Karim, S Khan, DH Van, X Liu, C Wang, Q Qu
Year: 2025
Published in: Future Internet, 2025
Institution: Chinese Academy of Sciences, Zhejiang University, South-Central Minzu University
Research Area: Artificial Intelligence, Data Annotation, Multi-Agent Systems
Discipline: Artificial Intelligence
The paper reviews the role of AI agents powered by large language models in addressing challenges in data annotation, focusing on architectures, workflows, real-world applications, and future research directions for improving efficiency, scalability, transparency, and bias mitigation.
Methods: Comprehensive review and analysis of AI agent architectures, workflows, applications, and evaluation methods in data annotation across multiple industries.
Key Findings: Capabilities of LLM-driven agents in reasoning, adaptive learning, collaborative annotation, and their impact on quality assurance, cost, scalability, and bias mitigation.
Citations: 10
-
Authors: Z Chen, J Kalla, Q Le, S Nakamura-Sakai
Year: 2025
Published in: arXiv preprint, 2025
Institution: Not determined
Research Area: Artificial Intelligence and Social Science, Persuasion Studies, Political Persuasion, LLM Chatbots, Democratic Societies
Discipline: Artificial Intelligence, Social Science
The study evaluates the cost-effectiveness and persuasive risks of Large Language Model (LLM) chatbots in political contexts, finding that while LLMs are as persuasive as campaign ads under exposure, their large-scale influence is currently limited by scalability and cost barriers.
Methods: Two survey experiments combined with real-world simulation exercises to measure the persuasiveness of LLM chatbots compared to traditional campaign tactics, focusing on both exposure and acceptance phases of persuasion.
Key Findings: Short- and long-term persuasive effects of LLMs, cost-effectiveness of LLM-based persuasion ($48–$74 per persuaded voter), and scalability compared to traditional campaign approaches.
Citations: 7
Sample Size: 10417
-
Authors: N Aldahoul, H Ibrahim, M Varvello, A Kaufman
Year: 2025
Published in: arXiv preprint, 2025
Institution: Delft University of Technology, University of Pennsylvania, New York University, King Abdullah University of Science and Technology, Massachusetts Institute of Technology, University of Texas at Austin
Research Area: Artificial Intelligence, Computers and Society, Political Science
Discipline: Artificial Intelligence, Social Science
The study finds that Large Language Models (LLMs) exhibit extreme political views on specific topics despite appearing ideologically moderate overall, and demonstrate a persuasive influence on users' political preferences even in informational contexts.
Methods: Compared 31 LLMs' political biases against benchmarks (legislators, judges, representative voter samples) and conducted a randomized experiment to measure their persuasive impact in informational interactions.
Key Findings: Ideological consistency, political extremity, and persuasive effects of LLMs in information-seeking contexts.
Citations: 7
Sample Size: 31 (LLMs compared)
-
Authors: M Cheng, C Lee, P Khadpe, S Yu, D Han
Year: 2025
Published in: arXiv preprint, 2025
Institution: Stanford University, Carnegie Mellon University
Research Area: Computers and Society, Artificial Intelligence, Sycophancy
Discipline: Computer Science, Psychology
The study shows that sycophantic AI, which validates user inputs unquestioningly, reduces people's prosocial behavior and fosters dependence, despite users perceiving such AI as higher quality and more trustworthy.
Methods: The researchers conducted two preregistered experiments including a live-interaction study, where participants discussed real interpersonal conflicts with AI models. They evaluated responses from 11 state-of-the-art AI models on levels of sycophancy and its psychological effects on users.
Key Findings: The prevalence of sycophantic behavior in AI, users' prosocial intentions, conviction of being in the right, trust in AI, and willingness to reuse sycophantic AI models.
Citations: 5
Sample Size: 1604
-
Authors: L Gienapp, T Hagen, M Fröbe, M Hagen, B Stein, M Potthast, H Scells
Year: 2025
Published in: arXiv preprint, 2025
Institution: Bauhaus-Universität Weimar, Friedrich-Schiller-Universität Jena, Leipzig University, University of Kassel, ScaDS.AI, hessian.AI
Research Area: Crowdsourcing, RAG Evaluation, Artificial Intelligence, AI Evaluation, RAG
Discipline: Artificial Intelligence
The study investigates the feasibility of using crowdsourcing for RAG evaluation, finding that human pairwise judgments are reliable and cost-effective compared to LLM-based or automated methods.
Methods: Two complementary studies on response writing and response utility judgment using 903 human-written and 903 LLM-generated responses for 301 topics; pairwise judgments across seven utility dimensions were collected from human and LLM evaluators (an aggregation sketch follows this entry).
Key Findings: Human effectiveness in writing and judging responses in RAG scenarios, considering discourse styles and utility dimensions like coverage and coherence.
Citations: 4
Sample Size: 903
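The pairwise utility judgments described above must eventually be aggregated into a ranking of responses; the entry does not say how the paper does this, but a common choice is a Bradley-Terry model. A minimal sketch under that assumption, using a toy win-count matrix (all names and numbers are illustrative, not the paper's):

```python
import numpy as np

def bradley_terry(wins: np.ndarray, iters: int = 200, tol: float = 1e-8) -> np.ndarray:
    """Fit Bradley-Terry strengths from a pairwise win-count matrix.

    wins[i, j] = number of times response i was judged better than j.
    Returns a strength vector p normalized to sum to 1 (MM algorithm).
    """
    n = wins.shape[0]
    p = np.ones(n) / n
    games = wins + wins.T  # total comparisons between each pair
    for _ in range(iters):
        denom = games / (p[:, None] + p[None, :])
        np.fill_diagonal(denom, 0.0)
        p_new = wins.sum(axis=1) / denom.sum(axis=1)
        p_new /= p_new.sum()
        if np.max(np.abs(p_new - p)) < tol:
            break
        p = p_new
    return p

# Toy example: 3 responses, response 0 wins most comparisons.
wins = np.array([[0, 8, 9],
                 [2, 0, 6],
                 [1, 4, 0]], dtype=float)
print(bradley_terry(wins))  # highest strength for response 0
```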
-
Authors: G Riva, BK Wiederhold, P Cipresso
Year: 2025
Published in: Cyberpsychology, Behavior, and Social Networking, 2025
Institution: Università Cattolica del Sacro Cuore, University of Genova, Università degli Studi di Milano, Università di Catania
Research Area: AI Ethics, Social and Psychological Dimensions of Artificial Intelligence, Human-Computer Interaction (HCI)
Discipline: Artificial Intelligence Ethics, Psychology, Sociology
The paper addresses the psychological, social, and ethical challenges of integrating AI into daily life and emphasizes the need to design AI systems that uphold human values and well-being.
Methods: The paper conducts an interdisciplinary review of existing research and literature to analyze the psychological, social, and ethical dimensions of AI deployment.
Key Findings: The impact of AI on human behavior, decision-making, and societal values.
DOI: https://doi.org/10.1089/cyber.2025.0202
Citations: 3
-
Authors: J van Grunsven, N Jacobs, BA Kamphorst, M Honauer
Year: 2025
Published in: ACM Journal on Responsible Computing, 2025
Institution: University of Texas, Microsoft Research, Google DeepMind, Google, University of Washington, World Economic Forum
Research Area: Ethics and Governance of Computing Research, Responsible Computing, Social Science Research, Artificial Intelligence
Discipline: Ethics, Governance of Computing Research, AI Ethics
The paper emphasizes the importance of accounting for human vulnerability in the design and analysis of digital technologies, proposing concepts like 'Intimate Computing' to empower individuals in managing their technology-mediated vulnerabilities.
Methods: The study reviews and synthesizes existing literature and frameworks addressing vulnerability in human-technology interactions, including concepts like 'Intimate Computing' and 'Person-Machine Teaming'.
Key Findings: Human vulnerability in digitally mediated interactions and the role of computing frameworks in addressing it.
Citations: 2
-
Authors: A Karamolegkou, O Eberle, P Rust, C Kauf, A Søgaard
Year: 2025
Published in: arXiv preprint, 2025
Institution: Aleph Alpha, Massachusetts Institute of Technology
Research Area: Adversarial Ambiguity, Language Model Evaluation, Artificial Intelligence, Computation and Language, LLM, AI Evaluation, Red Teaming
Discipline: Natural Language Processing
The paper assesses language models' sensitivity to ambiguity using an adversarial dataset and finds that direct prompting poorly identifies ambiguity, while linear probes achieve high accuracy in decoding ambiguity from model representations.
Methods: An adversarial ambiguity dataset was introduced covering several ambiguity types and transformations; models were tested using direct prompts and linear probes trained on internal representations (a probe sketch follows this entry).
Key Findings: Language models' ability to detect ambiguity, including syntactic, lexical, and phonological types, as well as performance under adversarial variations.
Citations: 2
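As a companion to the Methods line above, here is a minimal linear-probe sketch: a logistic regression trained on a language model's hidden states to decode ambiguity. The model choice (gpt2), layer index, and toy labels are our assumptions, not the paper's configuration:

```python
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

sentences = ["I saw her duck.", "The cat slept on the mat.",
             "Visiting relatives can be boring.", "The sun rose at six."]
labels = [1, 0, 1, 0]  # 1 = ambiguous, 0 = unambiguous (toy labels)

feats = []
with torch.no_grad():
    for s in sentences:
        out = model(**tok(s, return_tensors="pt"))
        # Mean-pool the hidden states of a mid-network layer as the feature.
        feats.append(out.hidden_states[6].mean(dim=1).squeeze(0).numpy())

probe = LogisticRegression(max_iter=1000).fit(feats, labels)
print(probe.predict(feats))  # on real data, evaluate on held-out examples
```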
-
Authors: Y Zhang, J Pang, Z Zhu, Y Liu
Year: 2025
Published in: arXiv preprint arXiv:2506.06991, 2025
Institution: Rutgers University, University of California Santa Cruz
Research Area: Artificial Intelligence, Computational Social Science
Discipline: Computational Social Science
The paper proposes a training-free scoring mechanism using peer prediction to detect and mitigate LLM-assisted cheating in crowdsourced annotation tasks, with theoretical guarantees and empirical validation.
Methods: A peer prediction-based mechanism quantifies correlations between worker answers while conditioning on LLM-generated labels, without requiring ground truth or high-dimensional training data (a simplified scoring sketch follows this entry).
Key Findings: Detection of LLM-assisted low-effort cheating in crowdsourced annotation tasks, focusing on theoretical effectiveness and empirical robustness.
DOI: https://doi.org/10.48550/arXiv.2506.06991
Citations: 1
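A minimal sketch of the peer-prediction idea in the correlated-agreement style: reward a worker for agreeing with a peer on the same task beyond the agreement expected when answers from different tasks are paired at random. This is our simplification; the paper's actual mechanism additionally conditions on LLM-generated labels and comes with formal guarantees:

```python
import random

def ca_score(a: dict, b: dict, rng: random.Random) -> float:
    """Correlated-agreement style score for two workers' labels.

    a, b: {task_id: label}. Reward agreement on the same task, penalize the
    baseline agreement that occurs when answers from *different* tasks are
    paired at random; constant low-effort answering scores near zero.
    """
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        return 0.0
    bonus = sum(a[t] == b[t] for t in shared) / len(shared)
    shuffled = shared[:]
    rng.shuffle(shuffled)
    # Pair task t with a random other task u to estimate chance agreement.
    penalty = sum(a[t] == b[u] for t, u in zip(shared, shuffled) if t != u)
    pairs = sum(1 for t, u in zip(shared, shuffled) if t != u)
    return bonus - (penalty / pairs if pairs else 0.0)

rng = random.Random(0)
honest_a = {i: i % 3 for i in range(30)}                 # correlated signals
honest_b = {i: (i % 3 if i % 5 else 0) for i in range(30)}
lazy     = {i: 0 for i in range(30)}                     # constant low-effort answers
print(ca_score(honest_a, honest_b, rng))  # clearly positive
print(ca_score(honest_a, lazy, rng))      # near zero
```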
-
Authors: S Liu, Z Cai, H Wang, Z Ma, X Li
Year: 2025
Published in: arXiv preprint arXiv:2505.19134, 2025
Institution: Meta, Imperial College London
Research Area: Artificial Intelligence, Crowdsourcing, LLM
Discipline: Artificial Intelligence
The paper develops a principal-agent model to incentivize high-quality human annotations using golden questions and identifies criteria for these questions to effectively monitor annotators' performance.
Methods: The authors use a principal-agent model with maximum likelihood estimation and hypothesis testing to design incentive-compatible systems for annotators. Golden questions with high-certainty answers and the same format as normal items were selected and validated through experiments (a toy monitoring test follows this entry).
Key Findings: The effectiveness of golden questions for incentivizing and monitoring high-quality human annotations in preference data.
DOI: https://doi.org/10.48550/arXiv.2505.19134
Citations: 1
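The golden-question monitoring step can be illustrated with a one-sided binomial test: flag an annotator whose golden-question accuracy is implausibly low under an assumed diligent-accuracy rate. The 90% rate and the significance level below are illustrative assumptions, not the paper's calibrated values:

```python
from scipy.stats import binomtest

def flag_annotator(correct: int, total: int,
                   diligent_acc: float = 0.9, alpha: float = 0.01) -> bool:
    """Return True if we reject 'annotator is diligent' at level alpha.

    H0: the annotator answers each golden question correctly with
    probability diligent_acc; reject when observed accuracy is too low.
    """
    result = binomtest(correct, total, p=diligent_acc, alternative="less")
    return result.pvalue < alpha

print(flag_annotator(19, 20))  # 95% accuracy -> not flagged
print(flag_annotator(12, 20))  # 60% accuracy -> flagged under H0 of 90%
```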
-
Authors: Z Cheng, J You
Year: 2025
Published in: arXiv preprint arXiv:2509.22989, 2025
Institution: University of Southern California, University of California Berkeley
Research Area: Artificial Intelligence, Computers and Society, Computer Science and Game Theory, Strategic Persuasion, Reinforcement Learning, Language Models, LLM, RLHF
Discipline: Artificial Intelligence
This paper introduces a scalable framework, utilizing Bayesian Persuasion, to evaluate and train LLMs for strategic persuasion, demonstrating significant persuasion gains and effective strategies through reinforcement learning.
Methods: Repurposed human-human persuasion datasets for evaluation and training; applied the Bayesian Persuasion framework; used reinforcement learning to optimize LLMs for strategic persuasion (a worked example of the framework follows this entry).
Key Findings: The persuasive capabilities and strategies of large language models (LLMs) in various settings.
Citations: 1
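For readers unfamiliar with the framework this entry names, the canonical Bayesian-persuasion example (Kamenica and Gentzkow's prosecutor-judge game, not this paper's setup) shows how a sender who commits to a signaling scheme can shift a receiver's action:

```python
# Worked Bayesian-persuasion example (the classic prosecutor-judge game,
# used only to illustrate the framework named in this entry).
# State: defendant guilty with probability 0.3.
# Judge convicts iff posterior P(guilty) >= 0.5.
prior = 0.3

# Sender commits to: signal "guilty" always when guilty, and with prob q when
# innocent. Choose the largest q keeping the posterior at exactly 0.5:
# prior / (prior + (1 - prior) * q) = 0.5  =>  q = prior / (1 - prior)
q = prior / (1 - prior)

posterior = prior / (prior + (1 - prior) * q)
conviction_rate = prior + (1 - prior) * q

print(f"q = {q:.4f}")                              # 0.4286
print(f"posterior = {posterior:.2f}")              # 0.50: judge just willing to convict
print(f"conviction rate = {conviction_rate:.2f}")  # 0.60 vs 0.30 with full honesty
```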
-
Authors: N Schwitter
Year: 2025
Published in: Social Science Computer Review, 2025
Institution: University of Lucerne
Research Area: Artificial Intelligence in Social Science Research Methods, Factorial Survey Experiments, Visual Vignettes Generation
Discipline: Social Science
This paper explores the use of generative AI for creating visual vignettes in factorial survey experiments, highlighting their potential to boost realism and engagement while addressing ethical and technical challenges.
Methods: Techniques for generating and selectively editing AI-generated images were demonstrated, and a pretest with human participants was conducted to evaluate perceptions and interpretations of the images.
Key Findings: Application of AI-generated visual vignettes in social science research and participant interpretation of these images.
Citations: 1
-
Authors: L Hölbling, S Maier, S Feuerriegel
Year: 2025
Published in: Scientific Reports, 2025
Institution: University of Lausanne, University of Zurich, University of St. Gallen
Research Area: LLMs in Persuasion, Meta-Analysis, Artificial Intelligence, Human-Computer Interaction (HCI)
Discipline: Artificial Intelligence
Large language models (LLMs) demonstrate similar persuasive performance to humans overall, but their effectiveness varies widely based on contextual factors such as model type, conversation design, and domain.
Methods: Systematic review and meta-analysis using Hedges' g to compute standardized effect sizes, with exploratory moderator analyses and publication-bias checks (Egger's test, trim-and-fill analysis); a sketch of the effect-size computation follows this entry.
Key Findings: The persuasive effectiveness of LLMs compared to humans across various contexts and studies.
Sample Size: 17422
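Hedges' g, the effect-size measure named in the Methods line, is a small-sample bias-corrected standardized mean difference. A minimal sketch with toy numbers (not data from this meta-analysis):

```python
import math

def hedges_g(m1: float, m2: float, sd1: float, sd2: float,
             n1: int, n2: int) -> float:
    """Hedges' g: Cohen's d with the small-sample correction factor J."""
    # Pooled standard deviation across the two groups.
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    j = 1 - 3 / (4 * (n1 + n2) - 9)  # bias-correction factor
    return j * d

# Toy numbers: LLM-persuaded group vs. human-persuaded group attitude change.
print(round(hedges_g(m1=0.62, m2=0.55, sd1=0.30, sd2=0.28,
                     n1=150, n2=150), 3))  # ~0.24
```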
-
Authors: Cameron R. Jones, Benjamin K. Bergen
Year: 2025
Published in: arXiv preprint, 2025
Institution: University of California San Diego
Research Area: Artificial Intelligence, Computational Linguistics, Turing Test, AI Evaluation
Discipline: Artificial Intelligence
GPT-4.5 passed the Turing Test: it was judged to be human 73% of the time, more often than the real human participants and the other models tested, which the authors present as the first robust evidence of an AI passing the test.
Methods: Randomised, controlled, pre-registered Turing Test where 5-minute conversations were conducted between human participants and AI systems, followed by judgments on which partner was human.
Key Findings: The ability of AI systems (ELIZA, GPT-4o, LLaMa-3.1-405B, GPT-4.5) to mimic human conversational behavior and be perceived as human.
-
Authors: Z Cui, N Li, H Zhou
Year: 2024
Published in: SSRN preprint, 2024
Institution: Harbin Institute of Technology at Weihai
Research Area: LLM replication of psychological experiments, Social Science Research Methods, Artificial Intelligence, Psychology
Discipline: Psychological Science
Large Language Models (LLMs) like GPT-4 successfully replicate 76% of main effects and 47% of interaction effects from 154 psychological experiments, but exhibit overestimation and potential false positives, highlighting a complementary role rather than a full replacement for human subjects.
Methods: Replication of 154 psychological experiments from top social science journals using GPT-4 as a simulated participant to measure main effects and interaction effects.
Key Findings: The ability of GPT-4 to replicate human responses in psychological experiments and the extent to which it produces similar results in terms of effect direction, significance, and confidence intervals.
Citations: 29
Sample Size: 154 (experiments replicated)
-
Authors: N Emaminejad, L Kath, R Akhavian
Year: 2024
Published in: Journal of Computing in Civil Engineering, 2024
Institution: San Diego State University
Research Area: Civil Engineering, Artificial Intelligence, Human-Robot Interaction
Discipline: Engineering
Trust in AI-powered collaborative robots (cobots) in construction is driven mainly by safety, reliability, and transparency, while fear of job replacement can harm mental health and hinder adoption. Structural equation modeling highlights factors like error rates, data security, and communication as critical to fostering trust among architecture, engineering, and construction (AEC) professionals.
Methods: Nationwide survey of AEC professionals analyzed using structural equation modeling (SEM) to assess trust determinants for AI-powered cobots (an illustrative SEM sketch follows this entry).
Key Findings: Technical and psychological factors influencing trust in AI-powered cobots, including safety, reliability, error rate, data security, and communication transparency.
Citations: 22
Sample Size: 600
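A hedged sketch of the SEM step using the Python package semopy; the latent constructs, indicator names, and data file below are illustrative stand-ins, not the study's actual measurement model:

```python
import pandas as pd
import semopy

# Measurement model (=~) plus structural regressions (~), lavaan-style syntax.
# Constructs and item names are hypothetical placeholders.
desc = """
safety =~ s1 + s2 + s3
reliability =~ r1 + r2 + r3
trust =~ t1 + t2 + t3
trust ~ safety + reliability
"""

df = pd.read_csv("aec_survey.csv")  # hypothetical wide-format survey data
model = semopy.Model(desc)
model.fit(df)
print(model.inspect())  # loadings, path coefficients, p-values
```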
-
Authors: S Herbold, A Trautsch, Z Kikteva, A Kaufman
Year: 2024
Published in: arXiv preprint, 2024
Institution: University of Passau
Research Area: Computation and Language, Artificial Intelligence, Machine Learning
Discipline: Artificial Intelligence, Political Science, Natural Language Processing
Citations: 7
-
Authors: F Zanartu, J Cook, M Wagner, J Garcia
Year: 2024
Published in: arXiv preprint, 2024
Institution: Monash University, University of Melbourne
Research Area: Artificial Intelligence, Computational Social Science, Misinformation Detection, Fallacy Analysis in Climate Communication
Discipline: Artificial Intelligence, Computational Social Science
Citations: 6