Browse 12 peer-reviewed papers from Google DeepMind spanning AI Bias and LLM research (2023–2026). This page lists those papers as they appear in the Prolific Citations Library, a curated collection of research powered by high-quality human participant data from Prolific.
-
Authors: L Qiu, F Sha, K Allen, Y Kim, T Linzen, S van Steenkiste
Year: 2026
Published in: Nature …, 2026 - nature.com
Institution: Meta, Google DeepMind, Massachusetts Institute of Technology, Google Research, Google
Research Area: Probabilistic reasoning, Bayesian cognition, Neural language models, Reasoning, AI Evaluations
Discipline: Machine learning, Artificial intelligence
This paper sits at the intersection of machine learning and computational cognitive science, showing that large language models can acquire generalized probabilistic reasoning by being trained to imitate Bayesian belief updating rather than relying on prompting or heuristics.
Citations: 8
-
Authors: L Ibrahim, C Akbulut, R Elasmar, C Rastogi, M Kahng, MR Morris, KR McKee, V Rieser, M Shanahan, L Weidinger
Year: 2025
Published in: arXiv preprint arXiv:2502.07077, 2025 - arxiv.org
Institution: Google DeepMind, Google, University of Oxford
Research Area: Multimodal conversational AI, conversational AI, Evaluation methodology, benchmarking
Discipline: Computer Science, Natural Language Processing (NLP), Human–Computer Interaction (HCI)
The paper evaluates anthropomorphic behaviors in state-of-the-art LLMs using a multi-turn methodology, showing that behaviors such as empathy and relationship-building emerge predominantly over multiple interactions and shape user perceptions.
Methods: Multi-turn evaluation of 14 anthropomorphic behaviors using simulations of user interactions, validated by a large-scale human subject study.
Key Findings: Identifies anthropomorphic behaviors in large language models, including relationship-building and pronoun usage, and characterizes how users perceive them.
Citations: 26
Sample Size: 1101
-
Authors: L Muttenthaler, K Greff, F Born, B Spitzer, S Kornblith
Year: 2025
Published in: Nature, 2025 - nature.com
Institution: Google DeepMind, Google, Machine Learning Group, Technische Universität Berlin, BIFOLD, Berlin Institute for the Foundations of Learning and Data, Max Planck Institute
Research Area: Cognitive Alignment, Computer Vision, Multi-level Conceptual Knowledge
Discipline: Artificial Intelligence, Cognitive Science
This paper presents a method for aligning machine vision model representations with human visual similarity judgments across different abstraction levels, improving how well models reflect human perceptual and conceptual organization and enhancing generalization and uncertainty prediction.
Citations: 11
-
Authors: J van Grunsven, N Jacobs, BA Kamphorst, M Honauer
Year: 2025
Published in: ACM Journal on, 2025 - dl.acm.org
Institution: University of Texas, Microsoft Research, Google DeepMind, Google, University of Washington, World Economic Forum
Research Area: Ethics and Governance of Computing Research, Responsible Computing, Social Science Research, Artificial Intelligence
Discipline: Ethics, Governance of Computing Research, AI Ethics
The paper emphasizes the importance of accounting for human vulnerability in the design and analysis of digital technologies, proposing concepts like 'Intimate Computing' to empower individuals in managing their technology-mediated vulnerabilities.
Methods: The study reviews and synthesizes existing literature and frameworks addressing vulnerability in human-technology interactions, including concepts like 'Intimate Computing' and 'Person-Machine Teaming'.
Key Findings: Characterizes human vulnerability in digitally mediated interactions and the role of computing frameworks in addressing it.
Citations: 2
-
Authors: C Qian, V Tsai, M Behr, N Hussein, L Laugier, N Thain, L Dixon
Year: 2025
Published in: ArXiv
Institution: Google, Google DeepMind, EPFL
Research Area: Human-AI Interaction, Social Experiments, Platform Design
Discipline: Computational Social Science
Deliberate Lab is an open-source platform designed to enable real-time, multi-user human and AI (LLM) experiments. Developed by DeepMind researchers, it supports synchronous interaction and custom experimental stages, and integrates with platforms like Prolific for streamlined participant recruitment and payment. The system has been used in over 600 experiments with more than 9,000 participants.
Citations: 1
-
Authors: C Qian, AT Parisi, C Bouleau, V Tsai
Year: 2025
Published in: Proceedings of the ..., 2025 - aclanthology.org
Institution: Google, Google DeepMind
Research Area: Human-AI Alignment, Collective Reasoning, Social Biases, LLM Simulation of Human Behavior, AI Bias
Discipline: Natural Language Processing, Artificial Intelligence, Computational Social Science
This study examines human-AI alignment in collective reasoning using an empirical framework, demonstrating how LLMs either mirror or mask human biases depending on context, cues, and model-specific inductive biases.
Methods: The study uses the Lost at Sea social psychology task in a large-scale online experiment, simulating LLM groups conditioned on human decision-making data across varying conditions of visible or pseudonymous demographics.
Key Findings: Alignment of LLM behavior with human social reasoning, focusing on collective decision-making and biases in group interactions.
Citations: 1
Sample Size: 748
-
Authors: C Rastogi, TH Teh, P Mishra, R Patel, D Wang, M Díaz, A Parrish, AM Davani, Z Ashwood
Year: 2025
Published in: arXiv preprint arXiv:2507.13383, 2025•arxiv.org
Institution: Google DeepMind, Google Research, Google
Research Area: AI alignment, safety evaluation, AI Safety, Multimodal evaluation, Human–AI interaction, LLM
Discipline: Computer Science, Machine Learning, Artificial Intelligence
This research introduces the DIVE dataset to enable pluralistic alignment in text-to-image (T2I) models by accounting for diverse safety perspectives, revealing demographic variation in harm perception and informing T2I alignment strategies.
Methods: The study involved collecting feedback across 1000 prompts from demographically intersectional human raters to capture diverse safety perspectives, with an emphasis on empirical and contextual differences in harm perception.
Key Findings: Safety perceptions of text-to-image (T2I) model outputs from diverse demographic viewpoints and the influence of these perspectives on alignment strategies.
Citations: 1
Sample Size: 1000
-
Authors: Gemma Team
Year: 2024
Published in: ArXiv
Institution: Google DeepMind, Google
Research Area: LLM, Model Efficiency, Architecture
Discipline: Artificial Intelligence
Gemma 2 introduces scalable Transformer-based language models (2B–27B parameters) enhanced with techniques such as local-global and grouped-query attention, achieving state-of-the-art performance for their size and competing with larger models.
Methods: The study applied modifications to the Transformer architecture, such as local-global attention and grouped-query attention, as well as knowledge-distillation training for select model sizes.
Key Findings: Performance of lightweight language models in terms of efficiency and competitiveness with larger models.
Citations: 1649
-
Authors: PW Mirowski, J Love, K Mathewson, S Mohamed
Year: 2024
Published in: ArXiv
Institution: Google DeepMind, Google
Research Area: AI Creativity, Humor Generation, Human-Computer Interaction (HCI)
Discipline: Artificial Intelligence
Professional comedians found LLMs insufficient as creativity support tools for comedy, citing bias, bland output, and the reinforcement of hegemonic viewpoints.
Methods: Workshops conducted with professional comedians combining comedy writing sessions using LLMs, a Creativity Support Index questionnaire, and focus groups discussing their experiences and ethical concerns.
Key Findings: Effectiveness of LLMs as creativity support tools for comedy writing, ethical concerns (bias, censorship, copyright), and value alignment in AI outputs.
Citations: 52
Sample Size: 20
-
Authors: T Eloundou, A Beutel, DG Robinson
Year: 2024
Published in: arXiv preprint arXiv ..., 2024 - arxiv.org
Institution: OpenAI, Google DeepMind, Google, University of Oxford
Research Area: Fairness in LLM, AI Bias, AI Ethics
Discipline: Artificial Intelligence, Social Science
The paper introduces a counterfactual approach to evaluate 'first-person fairness' in chatbots, demonstrating that reinforcement learning can mitigate biases based on demographics across extensive chatbot interactions.
Methods: The study uses a Language Model as a Research Assistant (LMRA) to quantitatively and qualitatively assess biases based on demographics across millions of chatbot interactions, covering 66 tasks in 9 domains and involving two genders and four races. Bias evaluations are corroborated by independent...
Key Findings: Demographic biases in chatbot responses, including harmful stereotypes and response differences by gender and race, across diverse tasks and domains.
DOI: https://doi.org/10.48550/arXiv.2410.19803
Citations: 33
Sample Size: 6000000
-
Authors: S Schmer-Galunder, R Wheelock, Z Jalan
Year: 2024
Published in: Proceedings of the ..., 2024 - ojs.aaai.org
Institution: Google DeepMind, Google, Accenture, Amazon
Research Area: AI Ethics and Prosocial Data Annotation
Discipline: Artificial Intelligence, Ethics, Behavioral Science
DOI: https://doi.org/10.1609/aies.v7i1.31726
Citations: 3
-
Authors: HR Kirk, B Vidgen, P Röttger, SA Hale
Year: 2023
Published in: arXiv preprint arXiv:2303.05453, 2023 - arxiv.org
Institution: The Alan Turing Institute, University of Oxford, Imperial College London, King's College London, Google DeepMind
Research Area: Large Language Model Alignment, Safety, Personalization Risks
Discipline: Artificial Intelligence
DOI: https://doi.org/10.48550/arXiv.2303.05453
Citations: 146