Online images amplify gender bias
Abstract
Each year, people spend less time reading and more time viewing images, which are proliferating online. Images from platforms such as Google and Wikipedia are downloaded by millions every day, and millions more are interacting through social media, such as Instagram and TikTok, that primarily consist of exchanging visual content. In parallel, news agencies and digital advertisers are increasingly capturing attention online through the use of images, which people process more quickly, implicitly and memorably than text. Here we show that the rise of images online significantly exacerbates gender bias, both in its statistical prevalence and its psychological impact. We examine the gender associations of 3,495 social categories (such as 'nurse' or 'banker') in more than one million images from Google, Wikipedia and Internet Movie Database (IMDb), and in billions of words from these platforms. We find that gender bias is consistently more prevalent in images than text for both female- and male-typed categories. We also show that the documented underrepresentation of women online is substantially worse in images than in text, public opinion and US census data. Finally, we conducted a nationally representative, preregistered experiment that shows that googling for images rather than textual descriptions of occupations amplifies gender bias in participants' beliefs. Addressing the societal effect of this large-scale shift towards visual communication will be essential for developing a fair and inclusive future for the internet.
Study specs
Analyzed 3,495 social categories using over one million images from platforms like Google, Wikipedia, and IMDb, compared visual content to billions of words from the same platforms, and conducted a preregistered national experiment to assess the psychological impact on participants' beliefs.
- Authors
- D Guilbeault,S Delecourt,T Hull,BS Desikan,M Chu
- Institution
- University of California Berkeley,Institute For Public Policy Research,Columbia University,University of Southern California Los Angeles
- Discipline
- Computational Social Science
- Sample Size
- N=3,495
- Study Type
- Experimental Study
- Year
- 2024
- Human Data Platform
- Prolific
- Source
- View Source DOI Google Scholar
Measured Outcomes
The prevalence and psychological impact of gender bias in online images compared to text, including gender associations and representation disparities.
Peer Review & Critical Discussion
Potential Selection Bias in 2023 Cohort
The participant pool shows a concerning overrepresentation of users from high-income demographics. Looking at Table 3, we can see that 78% of respondents had annual incomes above $75k, which significantly limits the generalizability of these findings to broader populations.
Non-naive Participants Issue
I've noticed a methodological concern regarding participant naivety. Given that Prolific users often complete multiple studies, there's a real risk that participants had prior exposure to similar experimental paradigms, which could confound the results.
RLHF Applicability to This Study Design
The implications for RLHF training pipelines are understated. If we accept the authors' conclusions about preference stability, this has direct consequences for how we should structure reward model training. The temporal decay effect described in Section 4.2 is particularly relevant.
Verify your expertise to join discussion
Create an account and verify your credentials to participate in peer discussions.