The paper investigates the connection between large language model (LLM) misgendering and the tokenization process these models use, focusing on neopronouns (non-binary pronouns such as xe/xem). The authors find that Byte-Pair Encoding (BPE), a widely adopted subword tokenization technique, overfragments neopronouns relative to binary pronouns because neopronouns appear rarely in the training data.
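A toy illustration of the fragmentation effect, using a greedy longest-match subword tokenizer as a simplified stand-in for BPE (the vocabulary and tokenizer here are hypothetical, not the paper's actual setup):

```python
def tokenize(word, vocab):
    """Greedy longest-match subword tokenization (a simplified stand-in for BPE)."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown piece: fall back to one character
            i += 1
    return tokens

# Hypothetical learned vocabulary: frequent binary-pronoun forms made it in as
# whole tokens; rare neopronoun forms (xe/xem/xemself) did not.
vocab = {"he", "him", "his", "himself", "she", "her", "hers", "herself", "self"}

print(tokenize("herself", vocab))  # -> ['herself'] (one token)
print(tokenize("xemself", vocab))  # -> ['x', 'e', 'm', 'self'] (overfragmented)
```

The same word-level role (reflexive pronoun) thus costs one token for a binary pronoun but four for the neopronoun, which is the disparity the paper ties to data scarcity.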
The authors first establish a link between LLM misgendering and poor neopronoun grammatical proficiency. They introduce three evaluation metrics (pronoun consistency, pronoun case error, and adversarial injection error) to quantify an LLM's command of different pronoun forms. The results show that misgendering correlates strongly with grammatical errors, suggesting that improving an LLM's neopronoun morphosyntax could reduce its tendency to misgender.
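As a sketch of how the first of these metrics might be computed (the pronoun families and the scoring rule below are assumptions for illustration; the paper's exact metric definitions may differ):

```python
# Hypothetical pronoun families; xe/xem/xyr is one common neopronoun paradigm.
BINARY = {"he", "him", "his", "himself", "she", "her", "hers", "herself"}
XE = {"xe", "xem", "xyr", "xyrs", "xemself"}

def pronoun_consistency(text, declared):
    """Fraction of third-person pronouns in `text` that come from the subject's
    declared pronoun family (returns None if no pronouns occur)."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    used = [w for w in words if w in BINARY | XE]
    if not used:
        return None
    return sum(w in declared for w in used) / len(used)

# One misgendered pronoun ('his') out of three: consistency = 2/3.
print(pronoun_consistency("Xe said xem liked his new book.", XE))
```

A case-error metric would analogously flag pronouns from the right family used in the wrong grammatical case (e.g. subject form where an object form belongs).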
To address this issue, the authors propose two techniques: 1) Pronoun Tokenization Parity (PTP), which enforces consistent tokenization across gendered pronouns, and 2) lexical-layer finetuning, which leverages a model's pre-existing pronoun knowledge to improve neopronoun proficiency. Experiments across model sizes demonstrate that these methods significantly outperform standard finetuning, improving neopronoun accuracy from 14.1% to 58.4%. Notably, lexical finetuning with PTP consistently improves pronoun consistency across model sizes, with smaller models seeing the largest gains.
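A minimal sketch of the two ideas, under assumed details: PTP is modeled as adding each neopronoun form to the vocabulary as a whole token, and the lexical step as initializing the new tokens' embedding rows from analogous binary-pronoun rows, which would then be trained while the rest of the model stays frozen (the tokenizer, vocabulary, embeddings, and initialization mapping here are all illustrative assumptions, not the paper's exact procedure):

```python
def tokenize(word, vocab):
    # Greedy longest-match subword tokenizer (simplified stand-in for BPE).
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])
            i += 1
    return tokens

base_vocab = {"she", "her", "hers", "herself", "self"}

# 1) Pronoun Tokenization Parity: give each neopronoun form its own whole
#    token, so 'xemself' tokenizes into as many pieces as 'herself'.
ptp_vocab = base_vocab | {"xe", "xem", "xyr", "xemself"}

# 2) Lexical finetuning (hypothetical 2-d embeddings): initialize each new row
#    from the analogous binary-pronoun row, then train only these rows while
#    the transformer body stays frozen.
embeddings = {"she": [0.2, 0.9], "her": [0.1, 0.8], "herself": [0.3, 0.7]}
for neo, src in {"xe": "she", "xem": "her", "xemself": "herself"}.items():
    embeddings[neo] = list(embeddings[src])  # copy, so the rows can diverge

print(tokenize("xemself", ptp_vocab))  # -> ['xemself'] (parity with 'herself')
```

The design choice being illustrated: because only a handful of embedding rows are trainable, the approach can transfer pronoun knowledge already present in the frozen model rather than relearning it, which is consistent with the summary's observation that smaller models benefit most.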
The paper highlights that the observed tokenization disparities, a consequence of data scarcity, are a key contributor to LLM misgendering of underrepresented pronouns. The proposed solutions provide a promising path forward for developing more inclusive and grammatically proficient language models.
Key insights extracted from arxiv.org, by Anaelia Oval..., 04-09-2024.
https://arxiv.org/pdf/2312.11779.pdf