Back to the Communities: A Mixed-Methods and Community-Driven Evaluation of Cultural Sensitivity in Text-to-Image Models
Abstract
Evidence shows that text-to-image (T2I) models disproportionately reflect Western cultural norms, amplifying misrepresentation and harms to minority groups. However, evaluating cultural sensitivity is inherently complex due to its fluid and multifaceted nature. Drawing on a state-of-the-art review and co-creation workshops involving 59 individuals from 19 countries, we developed and validated a mixed-methods, community-based evaluation methodology that embraces first-person methods to assess cultural sensitivity in T2I models. Quantitative scores and qualitative inquiries expose convergence and disagreement within and across communities, illuminate the downstream consequences of misrepresentation, and trace how training data shaped by unequal power relations distort depictions. Extensive assessments are constrained by high resource requirements and the dynamic nature of culture, a tension we alleviate through a context-based and iterative methodology. The paper provides actionable recommendations for stakeholders, highlighting pathways to investigate the sources, mechanisms, and impacts of cultural (mis)representation in T2I models.