AI-induced indifference: Unfair AI reduces prosociality
Abstract
The growing prevalence of artificial intelligence (AI) in our lives has brought the impact of AI-based decisions on human judgments to the forefront of academic scholarship and public debate. Despite growth in research on people's receptivity towards AI, little is known about how interacting with AI shapes subsequent interactions among people. We explore this question in the context of unfair decisions determined by AI versus humans and focus on the spillover effects of experiencing such decisions on the propensity to act prosocially. Four experiments (combined *N* = 2425) show that receiving an unfair allocation by an AI (versus a human) actor leads to lower rates of prosocial behavior towards other humans in a subsequent decision---an effect we term *AI-induced indifference*. In Experiment 1, after receiving an unfair monetary allocation by an AI (versus a human) actor, people were less likely to act prosocially, defined as punishing an unfair human actor at a personal cost in a subsequent, unrelated decision. Experiments 2a and 2b provide evidence for the underlying mechanism: People blame AI actors less than their human counterparts for unfair behavior, decreasing people's desire to subsequently sanction injustice by punishing the unfair actor. In an incentive-compatible design, Experiment 3 shows that AI-induced indifference manifests even when the initial unfair decision and subsequent interaction occur in different contexts. These findings illustrate the spillover effect of human-AI interaction on human-to-human interactions and suggest that interacting with unfair AI may desensitize people to the bad behavior of others, reducing their likelihood to act prosocially. Implications for future research are discussed.
Study specs
- Institution: The Hong Kong University of Science and Technology, China Europe International Business School, University of South Florida, HEC Paris, University of Mannheim
- Discipline: Social Psychology, Behavioral Science
- Year: 2022
- Human Data Platform: Prolific
Peer Review & Critical Discussion
Potential Selection Bias in 2023 Cohort
The participant pool shows a concerning overrepresentation of users from high-income demographics. Looking at Table 3, we can see that 78% of respondents had annual incomes above $75k, which significantly limits the generalizability of these findings to broader populations.
Non-naive Participants Issue
I've noticed a methodological concern regarding participant naivety. Given that Prolific users often complete multiple studies, there's a real risk that participants had prior exposure to similar experimental paradigms, which could confound the results.
RLHF Applicability to This Study Design
The implications for RLHF training pipelines are understated. If we accept the authors' conclusions about preference stability, this has direct consequences for how we should structure reward model training. The temporal decay effect described in Section 4.2 is particularly relevant.