AI can help people feel heard, but an AI label diminishes this impact

201 citations

Abstract

People want to "feel heard": to perceive that they are understood, validated, and valued. Can AI serve the deeply human function of making others feel heard? Our research addresses two fundamental issues: Can AI generate responses that make human recipients feel heard, and how do human recipients react when they believe the response comes from AI? We conducted an experiment and a follow-up study to disentangle the effects of a message's actual source and its presumed source. We found that AI-generated messages made recipients feel more heard than human-generated messages and that AI was better at detecting emotions. However, recipients felt less heard when they realized that a message came from AI (vs. human). Finally, in a follow-up study where the responses were rated by third-party raters, we found that compared with humans, AI demonstrated superior discipline in offering emotional support, a crucial element in making individuals feel heard, while avoiding excessive practical suggestions, which may be less effective in achieving this goal. Our research underscores the potential and limitations of AI in meeting human psychological needs. These findings suggest that while AI demonstrates enhanced capabilities to provide emotional support, the devaluation of AI responses poses a key challenge for effectively leveraging AI's capabilities.

Study specs

Experiment and follow-up study to assess recipient reactions to AI vs. human-generated responses and determine emotional support efficacy.

Discipline: Social Science
Study Type: Experimental Study
Year: 2024
Human Data Platform: Prolific

Measured Outcomes

The degree to which recipients feel heard, emotion detection accuracy, and third-party ratings of emotional support quality.

Peer Review & Critical Discussion

3 threads

Potential Selection Bias in 2023 Cohort

Dr. Sarah J.
Verified PhD Candidate
12 replies

The participant pool shows a concerning overrepresentation of users from high-income demographics. Looking at Table 3, we can see that 78% of respondents had annual incomes above $75k, which significantly limits the generalizability of these findings to broader populations.

2 hours ago

Non-naive Participants Issue

M. Chen (OpenAI)
Data Scientist
8 replies

I've noticed a methodological concern regarding participant naivety. Given that Prolific users often complete multiple studies, there's a real risk that participants had prior exposure to similar experimental paradigms, which could confound the results.

5 hours ago

RLHF Applicability to This Study Design

Prof. R. Williams
Verified Researcher
15 replies

The implications for RLHF training pipelines are understated. If we accept the authors' conclusions about preference stability, this has direct consequences for how we should structure reward model training. The temporal decay effect described in Section 4.2 is particularly relevant.

1 day ago
