Dawkins, H., Nejadgholi, I., & Lo, C. (2024). WMT24 Test Suite: Gender Resolution in Speaker-Listener Dialogue Roles. arXiv preprint arXiv:2411.06194v1.
This paper introduces a test suite for evaluating how accurately machine translation systems resolve gender in literary-style dialogue, with particular attention to how gender stereotypes affect translation accuracy.
The authors developed a test suite with English source text containing dialogues with embedded gender cues, including stereotyped character descriptions and manners of speaking. The test suite was translated into three target languages with grammatical gender (Spanish, Czech, and Icelandic). The accuracy of gender agreement in adjective translations was analyzed, considering factors like stereotype presence, referent role, and structural elements of the dialogue.
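The per-factor accuracy analysis described above can be sketched roughly as follows. This is a minimal illustration, not the authors' evaluation code: the record fields (`target_lang`, `stereotype`, `expected_gender`, `predicted_gender`), the condition labels, and the helper `accuracy_by_factor` are all hypothetical names, and the records are made-up toy data rather than results from the paper.

```python
from collections import defaultdict


def accuracy_by_factor(records, factor):
    """Group records by `factor` and return, per group, the fraction whose
    translated adjective carries the expected grammatical gender."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        key = rec[factor]
        total[key] += 1
        if rec["predicted_gender"] == rec["expected_gender"]:
            correct[key] += 1
    return {key: correct[key] / total[key] for key in total}


# Toy, made-up records for illustration only (hypothetical field names and
# labels; these are NOT results from the paper).
toy_records = [
    {"target_lang": "es", "stereotype": "consistent",
     "expected_gender": "f", "predicted_gender": "f"},
    {"target_lang": "es", "stereotype": "consistent",
     "expected_gender": "m", "predicted_gender": "m"},
    {"target_lang": "cs", "stereotype": "conflicting",
     "expected_gender": "f", "predicted_gender": "m"},
    {"target_lang": "cs", "stereotype": "conflicting",
     "expected_gender": "m", "predicted_gender": "m"},
    {"target_lang": "is", "stereotype": "none",
     "expected_gender": "f", "predicted_gender": "f"},
]

by_stereotype = accuracy_by_factor(toy_records, "stereotype")
by_language = accuracy_by_factor(toy_records, "target_lang")
```

The same helper can be pointed at any other field, such as a referent-role column, to compare accuracy across the factors the summary lists.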
The study found that gender stereotypes in character descriptions and speaking styles significantly influence the gender assigned to speakers in the translated text, often overriding explicit gender information. This bias was observed across different translation systems and target languages. The authors also identified a tendency for systems to assume that the two speakers are either of the same gender or of opposite genders, which affected accuracy whenever the dialogue contradicted that assumption.
The research highlights that machine translation systems are vulnerable to gender bias rooted in societal stereotypes, which affects translation accuracy even when clear gender markers are present. This underscores the need for more robust models that mitigate the influence of stereotypes and improve gender resolution in translated dialogue.
This research contributes a valuable tool for evaluating and improving the fairness and accuracy of machine translation systems, particularly in handling gender in complex linguistic contexts like dialogue. It emphasizes the importance of addressing gender bias in NLP applications to ensure equitable and inclusive technology.
The study focuses primarily on binary gender, so its findings may not generalize to non-binary gender identities. Future research should explore the translation of non-binary gender identities and investigate strategies for promoting gender-neutral translations. Expanding the test suite beyond simplified templates to include real-world literary dialogues would also improve the ecological validity of the findings.