Impact of AI-Assisted Diagnosis on American Patients' Trust in and Intention to Seek Help From Health Care Professionals: Randomized, Web-Based Survey ...

4 citations

Abstract

Background: Artificial intelligence (AI) technologies are increasingly integrated into medical practice, with AI-assisted diagnosis showing promise. However, patient acceptance of AI-assisted diagnosis, compared with human-only procedures, remains understudied, especially in the wake of generative AI advancements such as ChatGPT.

Objective: This research examines patient preferences for doctors using AI assistance versus those relying solely on human expertise. It also studies demographic, social, and experiential factors influencing these preferences.

Methods: We conducted a preregistered 4-group randomized survey experiment among a national sample representative of the US population on several demographic benchmarks (n=1762). Participants viewed identical doctor profiles, with varying AI usage descriptions: no AI mention (control, n=421), explicit nonuse (No AI, n=435), moderate use (Moderate AI, n=481), and extensive use (Extensive AI, n=425). Respondents reported their tendency to seek help, trust in the doctor as a person and a professional, knowledge of AI, frequency of using AI in their daily lives, demographics, and partisan identification. We analyzed the results with ordinary least squares regression (controlling for sociodemographic factors), mediation analysis, and moderation analysis. We also explored the moderating effect of past AI experiences on the tendency to seek help and trust in the doctor.

Results: Mentioning that the doctor uses AI to assist in diagnosis consistently decreased trust and intention to seek help. Trust and intention to seek help (measured with a 5-point Likert scale and coded as 0‐1 with equal intervals in between) were highest when AI was explicitly absent (control group: mean 0.50; No AI group: mean 0.63) and lowest when AI was extensively used (Extensive AI group: mean 0.30; Moderate AI group: mean 0.34). A linear regression controlling for demographics suggested that the negative effect of AI assistance was significant with a large effect size (β=−.45, 95% CI −0.49 to −0.40, *t* ~1740~=−20.81; *P* <.001). This pattern was consistent for trust in the doctor as a person (β=−.33, 95% CI −0.37 to −0.28, *t* ~1733~=−14.41; *P* <.001) and as a professional (β=−.40, 95% CI −0.45 to −0.36, *t* ~1735~=−18.54; *P* <.001). Results were consistent across age, gender, education, and partisanship, indicating a broad aversion to AI-assisted diagnosis. Moderation analyses suggested that the "AI trust gap" shrank as AI use frequency increased (interaction term: β=.09, 95% CI 0.04 to 0.13, *t* ~1735~=4.06; *P* <.001) but expanded as self-reported knowledge increased (interaction term: β=−.04, 95% CI −0.08 to 0.00, *t* ~1736~=−1.75; *P* =.08).

Conclusions: Despite AI's growing role in medicine, patients still prefer human-only expertise, regardless of partisanship and demographics, underscoring the need for strategies to build trust in AI technologies in health care.

Trial Registration: OSF Registries osf.io/5vcdg; <https://osf.io/5vcdg>
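
The Results describe outcomes "measured with a 5-point Likert scale and coded as 0‐1 with equal intervals in between." A minimal sketch of one such coding follows. It assumes raw responses are stored as integers 1 to 5, which the abstract does not state; the function name `rescale_likert` and the sample responses are hypothetical and are not study data.

```python
def rescale_likert(response: int, points: int = 5) -> float:
    """Map a response on a 1..points Likert scale onto [0, 1] with equal spacing.

    Assumption: raw responses are integers starting at 1.
    """
    if not 1 <= response <= points:
        raise ValueError(f"response must be between 1 and {points}")
    return (response - 1) / (points - 1)


# The five scale points land on equally spaced codes between 0 and 1.
codes = [rescale_likert(r) for r in range(1, 6)]

# Hypothetical responses from one experimental group (not study data).
example_responses = [2, 3, 3, 4, 1]
example_mean = sum(rescale_likert(r) for r in example_responses) / len(example_responses)
```

Under this coding, a group mean of 0.50 corresponds to the midpoint of the original scale, and a difference of 0.25 corresponds to one full scale point.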

Study specs

A preregistered, randomized, web-based 4-group survey experiment. Results were analyzed with ordinary least squares regression (controlling for sociodemographic factors), mediation analysis, and moderation analysis.
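
To illustrate what a moderation analysis estimates, the sketch below fits a regression with an interaction term between an AI-condition indicator and AI use frequency. It is a toy example under stated assumptions, not the authors' model: the data are synthetic and noise-free, the coefficient values in `TRUE_COEFFICIENTS` are invented, sociodemographic controls are omitted, and the helpers `solve_linear_system` and `fit_ols` are hypothetical names written in plain Python.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented coefficients: intercept, AI condition, use frequency, AI x frequency.
TRUE_COEFFICIENTS = (0.6, -0.4, 0.05, 0.1)

design, outcome = [], []
for ai in (0, 1):                           # 0 = no AI, 1 = AI-assisted
    for freq in (0.0, 0.25, 0.5, 0.75, 1.0):  # use frequency coded on 0-1
        row = [1.0, float(ai), freq, ai * freq]
        design.append(row)
        outcome.append(sum(c * x for c, x in zip(TRUE_COEFFICIENTS, row)))

intercept, b_ai, b_freq, b_interaction = fit_ols(design, outcome)

# The AI effect at a given frequency is b_ai + b_interaction * freq.
gap_at_lowest_use = b_ai
gap_at_highest_use = b_ai + b_interaction
```

In this toy setup a positive interaction coefficient means the negative AI effect is smaller among frequent AI users, which is the direction the abstract reports for AI use frequency.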

Authors: C Chen, Z Cui
Sample Size: N=1,762
Study Type: Survey Research
Year: 2025
Human Data Platform: Prolific

Measured Outcomes

Trust in and intention to seek medical help from health care professionals using AI-assisted diagnosis versus those avoiding AI, and the influence of demographic, social, and experiential factors.

Peer Review & Critical Discussion

3 threads

Potential Selection Bias in 2023 Cohort

Dr. Sarah J.
Verified PhD Candidate
12 replies

The participant pool shows a concerning overrepresentation of users from high-income demographics. Looking at Table 3, we can see that 78% of respondents had annual incomes above $75k, which significantly limits the generalizability of these findings to broader populations.

2 hours ago

Non-naive Participants Issue

M. Chen (OpenAI)
Data Scientist
8 replies

I've noticed a methodological concern regarding participant naivety. Given that Prolific users often complete multiple studies, there's a real risk that participants had prior exposure to similar experimental paradigms, which could confound the results.

5 hours ago

RLHF Applicability to This Study Design

Prof. R. Williams
Verified Researcher
15 replies

The implications for RLHF training pipelines are understated. If we accept the authors' conclusions about preference stability, this has direct consequences for how we should structure reward model training. The temporal decay effect described in Section 4.2 is particularly relevant.

1 day ago
