Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models
Authors: C Rastogi, TH Teh, P Mishra, R Patel, D Wang, M Díaz, A Parrish, AM Davani, Z Ashwood
Published: 2025
Publication: arXiv preprint arXiv:2507.13383, 2025
This research introduces the DIVE dataset to enable pluralistic alignment in text-to-image models by accounting for diverse safety perspectives, revealing demographic variations in harm perception and advancing T2I model alignment strategies.
Methods: The study collected safety feedback on 1,000 prompts from demographically intersectional human raters to capture diverse safety perspectives, emphasizing empirical and contextual differences in harm perception.
Key Findings: Safety perceptions of text-to-image (T2I) model outputs vary across demographic viewpoints, and these differing perspectives have implications for alignment strategies.
Limitations: Included content may be sensitive or harmful, potentially limiting the reproducibility and accessibility of the dataset; the generalizability of findings beyond the evaluated demographic groups is unclear.
Institution: Google DeepMind, Google Research, Google
Research Area: AI alignment, AI safety, safety evaluation, multimodal evaluation, human–AI interaction, LLMs
Discipline: Computer Science, Machine Learning, Artificial Intelligence
Sample Size: 1,000 participants
Citations: 1