Natural Language Processing Research

This page lists 17 peer-reviewed papers in the discipline of Natural Language Processing in the Prolific Citations Library, a curated collection of research powered by high-quality human data from Prolific.

Papers (17 of 17)

Moral Lenses, Political Coordinates: Towards Ideological Positioning of Morally Conditioned LLMs

Authors: C Yuan, B Ma, Z Zhang, B Prenkaj, F Kreuter, G Kasneci

Year: 2026

Published in: arXiv preprint arXiv:2601.08634, 2026 - arxiv.org

Institution: Munich Center for Machine Learning, LMU Munich, Technical University of Munich

Research Area: Artificial Intelligence, AI Ethics, AI Alignment, Political Science, Computational Social Science

Discipline: Computer Science, Natural Language Processing

This paper examines how the political outputs of large language models (LLMs) shift when the models are explicitly primed with different moral values. Rather than assigning personas (such as “pretend to be liberal”), the authors condition models to endorse or reject specific moral values (e.g., utilitarianism, fairness, authority). They then measure how those moral primes move the models’ positions in...

DOI: https://doi.org/10.48550/arXiv.2601.08634

Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models

Authors: L Ibrahim, C Akbulut, R Elasmar, C Rastogi, M Kahng, MR Morris, KR McKee, V Rieser, M Shanahan, L Weidinger

Year: 2025

Published in: arXiv preprint arXiv:2502.07077, 2025 - arxiv.org

Institution: Google DeepMind, Google, University of Oxford

Research Area: Multimodal conversational AI, conversational AI, Evaluation methodology, benchmarking

Discipline: Computer Science, Natural Language Processing, Human-Computer Interaction

The paper evaluates anthropomorphic behaviors in state-of-the-art LLMs using a multi-turn methodology, showing that such behaviors, including empathy and relationship-building, predominantly emerge after multiple interactions and influence user perceptions.

Methods: Multi-turn evaluation of 14 anthropomorphic behaviors using simulations of user interactions, validated by a large-scale human subject study.

Key Findings: Anthropomorphic behaviors in large language models, including relationship-building and pronoun usage, and their perception by users.

Citations: 26

Sample Size: 1101

Trick or Neat: Adversarial Ambiguity and Language Model Evaluation

Authors: A Karamolegkou, O Eberle, P Rust, C Kauf, A Søgaard

Year: 2025

Published in: ArXiv

Institution: Aleph Alpha, Massachusetts Institute of Technology

Research Area: Adversarial Ambiguity, Language Model Evaluation, Artificial Intelligence, Natural Language Processing, Large Language Models, AI Evaluation, Red Teaming

Discipline: Natural Language Processing

The paper assesses language models' sensitivity to ambiguity using an adversarial dataset and finds that direct prompting poorly identifies ambiguity, while linear probes achieve high accuracy in decoding ambiguity from model representations.

Methods: An adversarial ambiguity dataset was introduced with various types of ambiguities and transformations; models were tested using direct prompts and linear probes trained on internal representations.

Key Findings: Language models' ability to detect ambiguity, including syntactic, lexical, and phonological types, as well as performance under adversarial variations.

Citations: 2

Real-World Summarization: When Evaluation Reaches Its Limits

Authors: P Schmidtová, O Dušek, S Mahamood

Year: 2025

Published in: ArXiv

Institution: Charles University, Trivago

Research Area: Summarization evaluation, Natural Language Processing, LLM-as-a-Judge, AI Evaluation

Discipline: Natural Language Processing

Simpler metrics like word overlap surprisingly correlate well with human judgments in summarization evaluation, outperforming complex methods in out-of-domain applications, though LLMs remain unreliable for assessment due to annotation biases.

Methods: Human evaluation campaigns with categorical error assessment, span-level annotations, and comparison of traditional metrics, trainable models, and LLM-as-a-judge approaches.

Key Findings: Effectiveness of summarization evaluation methods and their correlation with human judgment, along with business impacts of incorrect information in generated summaries.

Citations: 1

To Mask or to Mirror: Human-AI Alignment in Collective Reasoning

Authors: C Qian, AT Parisi, C Bouleau, V Tsai

Year: 2025

Published in: Proceedings of the ..., 2025 - aclanthology.org

Institution: Google, Google DeepMind

Research Area: Human-AI Alignment, Collective Reasoning, Social Biases, LLM Simulation of Human Behavior, AI Bias

Discipline: Natural Language Processing, Artificial Intelligence, Computational Social Science

This study examines human-AI alignment in collective reasoning using an empirical framework, demonstrating how LLMs either mirror or mask human biases depending on context, cues, and model-specific inductive biases.

Methods: The study uses the Lost at Sea social psychology task in a large-scale online experiment, simulating LLM groups conditioned on human decision-making data across varying conditions of visible or pseudonymous demographics.

Key Findings: Alignment of LLM behavior with human social reasoning, focusing on collective decision-making and biases in group interactions.

Citations: 1

Sample Size: 748

LLM-based Semantic Augmentation for Harmful Content Detection

Authors: Elyas Meguellati, Assad Zeghina, Shazia Sadiq, Gianluca Demartini

Year: 2025

Published in: ArXiv

Institution: University of Queensland, University of Strasbourg

Research Area: Natural Language Processing, Harmful Content Detection

Discipline: Natural Language Processing

The paper introduces an approach using LLM-based semantic augmentation for harmful content detection on social media, achieving performance comparable to human-annotated models but at reduced cost.

Methods: The researchers utilize LLMs to clean noisy text and generate explanations for context-rich preprocessing, then evaluate the augmented training sets on multiple high-context datasets such as SemEval 2024 Persuasive Meme, Google Jigsaw toxic comments, and Facebook hateful memes datasets.

Key Findings: The efficacy of LLM-based semantic augmentation in enhancing training sets for social media tasks such as propaganda detection, hateful meme classification, and toxicity identification.

Large Language Models can impersonate politicians and other public figures

Authors: S Herbold, A Trautsch, Z Kikteva, A Kaufman

Year: 2024

Published in: arXiv preprint arXiv ..., 2024 - arxiv.org

Institution: University of Passau

Research Area: Natural Language Processing, Artificial Intelligence, Machine Learning

Discipline: Artificial Intelligence, Political Science, Natural Language Processing

Citations: 7

Controlled Evaluation of Syntactic Knowledge in Multilingual Language Models

Authors: Daria Kryvosheieva

Year: 2024

Published in: ArXiv

Institution: Massachusetts Institute of Technology

Research Area: Natural Language Processing, AI Evaluation

Discipline: Natural Language Processing

Does Differential Privacy Impact Bias in Pretrained NLP Models?

Authors: Md. Khairul Islam, Andrew Wang, Tianhao Wang, Yangfeng Ji, Judy Fox, Jieyu Zhao

Year: 2024

Published in: ArXiv

Institution: University of Virginia

Research Area: Differential Privacy, Bias Mitigation, Large Language Models, Natural Language Processing, AI Bias

Discipline: Artificial Intelligence, Natural Language Processing

Evaluating Creative Short Story Generation in Humans and Large Language Models

Authors: Mete Ismayilzada, Claire Stevenson, Lonneke van der Plas

Year: 2024

Published in: ArXiv

Institution: Idiap Research Institute, University of Amsterdam, Università della Svizzera Italiana, École Polytechnique Fédérale de Lausanne

Research Area: Creative Story Generation, LLM Evaluation, Computational Creativity

Discipline: Artificial Intelligence, Natural Language Processing, Computational Creativity

ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Language

Authors: Yuxin Wang, Xiaomeng Zhu, Weimin Lyu, Saeed Hassanpour, Soroush Vosoughi

Year: 2024

Published in: ArXiv

Institution: Dartmouth College, Stony Brook University, Yale University

Research Area: Natural Language Processing, Computational Linguistics

Discipline: Natural Language Processing

Order Effects in Annotation Tasks: Further Evidence of Annotation Sensitivity

Authors: Jacob Beck, Stephanie Eckman, Bolei Ma, Rob Chew, Frauke Kreuter

Year: 2024

Published in: ACL Anthology

Institution: University of Maryland

Research Area: Annotation Sensitivity, Order Effects, Natural Language Processing, Social Science in AI

Discipline: Natural Language Processing, Computational Social Science

Spica: Retrieving Scenarios for Pluralistic In-Context Alignment

Authors: Quan Ze Chen, K.J. Kevin Feng, Chan Young Park, Amy X. Zhang

Year: 2024

Published in: ArXiv

Institution: University of Washington

Research Area: In-Context Learning, Computational Linguistics, Natural Language Processing

Discipline: Computer Science, Computational Linguistics, Natural Language Processing

Using Language Models to Disambiguate Lexical Choices in Translation

Authors: J Barua, S Subramanian, K Yin, A Suhr

Year: 2024

Published in: ArXiv

Institution: University of California Berkeley

Research Area: Natural Language Processing, Machine Translation, Lexical Semantics

Discipline: Natural Language Processing

When do annotator demographics matter? Measuring the influence of annotator demographics with the POPQUORN dataset

Authors: J Pei, D Jurgens

Year: 2023

Published in: arXiv preprint arXiv:2306.06826, 2023 - arxiv.org

Institution: University of Michigan, University of Toronto

Research Area: Natural Language Processing

Discipline: Natural Language Processing, Human-Computer Interaction

DOI: https://doi.org/10.48550/arXiv.2306.06826

Citations: 55

Large language models are not zero-shot communicators

Authors: LE Ruis, A Khan, S Biderman, S Hooker, T Rocktäschel

Year: 2022

Published in: 2022 - openreview.net

Institution: MILA, University of Toronto, Stanford University, Hugging Face, Imperial College London

Research Area: Natural Language Processing, Large Language Models, Communication

Discipline: Natural Language Processing

Citations: 52

Exploring cross-cultural differences in English hate speech annotations: From dataset construction to analysis

Authors: N Lee, C Jung, J Myung, J Jin

Year: 2024

Published in: Proceedings of the ..., 2024 - aclanthology.org

Institution: KAIST, Cardiff University

Research Area: Hate Speech Annotation, Cross-Cultural Bias, NLP Ethics

Discipline: Natural Language Processing, Computational Social Science

Citations: 44