"Better Be Computer or I'm Dumb": A Large-Scale Evaluation of Humans as Audio Deepfake Detectors

Authors: K Warren, T Tucker, A Crowder, D Olszewski

Published: 2024

Publication: Proceedings of the ..., 2024 - dl.acm.org

Humans outperform machine learning models in classifying real human audio versus deepfakes, but are often misled by preconceptions about generated content, highlighting the need for more synergistic approaches between human and machine decision-making.

Methods: A large-scale user study in which over 1,200 participants evaluated audio samples from three widely cited deepfake datasets. Performance was measured quantitatively, and thematic analysis was used to explore user reasoning and how it differs from machine classification.

Key Findings: Comparison of human and machine classification performance on audio deepfake detection, analysis of user reasoning, and evaluation of error patterns in both humans and models.

Limitations: The study is limited by potential biases in user preconceptions about deepfake characteristics and by its reliance on three specific deepfake datasets, which may not fully represent the diversity of real-world audio deepfakes.

Institution: University of Florida

Research Area: Audio Deepfake Detection, Human Factors in AI Security, Perceptual Studies, AI Security

Discipline: Computer Science

Sample Size: Over 1,200 participants

Citations: 14

DOI: https://doi.org/10.1145/3658644.3670325