Core Concepts
Individuals' perceptions of offensive language are shaped by their diverse moral values and cultural backgrounds, leading to substantial disagreements that fair and inclusive language technologies must account for.
Abstract
This paper introduces the D3CODE dataset, a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences. The dataset was annotated by a pool of over 4K annotators, balanced across gender and age, from 21 countries representing eight geo-cultural regions. The dataset also captures annotators' moral values along six dimensions: care, equality, proportionality, authority, loyalty, and purity.
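To make the dataset's structure concrete, a single annotation record combining an offensiveness label with annotator demographics and moral values might look like the following. All field names and values here are hypothetical illustrations, not the actual D3CODE schema or release format.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    """Hypothetical record layout for one annotator's judgment of
    one sentence, mirroring the kinds of information described in
    the paper (not the real D3CODE schema)."""
    sentence_id: str
    annotator_id: str
    country: str                 # one of 21 countries
    region: str                  # one of 8 geo-cultural regions
    gender: str
    age_group: str
    offensiveness: int           # this annotator's label
    # Moral values along the six dimensions named in the paper
    moral_values: dict = field(default_factory=dict)

# Example record with made-up values
record = Annotation(
    sentence_id="s0001",
    annotator_id="a0427",
    country="Brazil",
    region="Latin America",
    gender="female",
    age_group="25-34",
    offensiveness=1,
    moral_values={"care": 4, "equality": 5, "proportionality": 3,
                  "authority": 2, "loyalty": 3, "purity": 2},
)
```

The key point of such a layout is that annotations are *parallel*: many annotators label the same sentence, so per-item disagreement can be analyzed against region, age, and moral values.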
The analyses reveal substantial regional variations in annotators' perceptions of offensiveness, which are shaped by their individual moral values. Annotators from certain regions, such as Oceania, North America, and Western Europe, were more likely to express uncertainty about understanding the annotation items than those from regions such as the Indian Cultural Sphere, Arab Culture, and Sub-Saharan Africa.
The study also found that items mentioning specific social identity groups evoked the highest levels of disagreement among annotators, significantly more than items with moral sentiment or randomly selected items. This underscores the need to account for cultural and individual differences in perceptions of offensive language, beyond just demographic variations, in order to build fair and inclusive language technologies.
The findings highlight the importance of incorporating diverse perspectives and moral considerations into the development and evaluation of language models, moving beyond a singular notion of offensiveness. The D3CODE dataset provides a valuable resource for assessing modeling approaches that can capture the nuanced and subjective nature of language perception across cultures.
Stats
Annotators from China, Brazil, and Egypt provided significantly different labels on the offensiveness of the content.
Annotators aged 50 and above were more likely to state that they did not understand the annotation items compared to younger age groups.
Items mentioning specific social identity groups evoked the highest levels of disagreement among annotators, significantly more than items with moral sentiment or randomly selected items.
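One simple way to quantify the per-item disagreement reported above is the Shannon entropy of an item's label distribution: 0 for unanimous agreement, maximal when labels split evenly. This is a generic illustrative sketch, not necessarily the disagreement measure used in the paper.

```python
from collections import Counter
import math

def label_entropy(labels):
    """Shannon entropy (bits) of an item's annotation labels.

    Higher entropy means annotators disagree more; 0.0 means
    all annotators gave the same label.
    """
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())

# Hypothetical binary labels: 1 = offensive, 0 = not offensive
unanimous = [1, 1, 1, 1]   # entropy 0.0 (full agreement)
split = [1, 0, 1, 0]       # entropy 1.0 (maximal disagreement)
```

Under such a measure, the finding would correspond to items mentioning social identity groups having systematically higher label entropy than moral-sentiment or randomly selected items.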
Quotes
"Perceiving language as offensive can depend inherently on one's moral judgments as well as the social norms dictated by the socio-cultural context within which one's assessments are made."
"Individuals might systematically disagree on notions of offensiveness, reflecting the complexity of beliefs and values that shape their perspectives and judgments within any given cultural context."
"Acknowledging and accounting for the diversity of moral judgments and values across different cultures and demographics is crucial for enhancing the fairness and inclusivity of language technologies."