The paper investigates the connection between large language model (LLM) misgendering of people who use neopronouns (novel non-binary pronouns such as xe/xem) and the tokenization process used by these models. The authors find that Byte-Pair Encoding (BPE), a widely adopted subword tokenization technique, over-fragments neopronouns relative to binary pronouns because neopronouns appear so rarely in the training data.
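The over-fragmentation effect can be seen in a minimal BPE sketch: frequent forms earn merge rules and collapse into single tokens, while rare forms stay split into characters. The toy corpus and its frequencies below are invented for illustration, not taken from the paper.

```python
from collections import Counter

def merge_word(symbols, pair):
    """Apply one merge rule to a symbol sequence."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def train_bpe(word_freqs, num_merges):
    """Learn merge rules by repeatedly fusing the most frequent adjacent pair."""
    vocab = {tuple(w): f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for pair in zip(word, word[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = {tuple(merge_word(word, best)): f for word, f in vocab.items()}
    return merges

def segment(word, merges):
    """Tokenize a word by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_word(symbols, pair)
    return symbols

# "she" is frequent in the toy corpus; the neopronoun "xe" is rare.
merges = train_bpe({"she": 1000, "her": 800, "xe": 2}, num_merges=2)
print(segment("she", merges))  # fuses into a single token
print(segment("xe", merges))   # never earns a merge; stays fragmented
```

Because merges are chosen purely by corpus frequency, any data-scarce form, neopronouns included, is left in pieces, which is the disparity the paper ties to downstream misgendering.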
The authors first establish a link between LLM misgendering and poor neopronoun grammatical proficiency. They introduce three evaluation metrics to quantify an LLM's command of different pronoun forms: pronoun consistency, pronoun case error, and adversarial injection error. The results show that misgendering is strongly associated with these grammatical errors, suggesting that strengthening an LLM's neopronoun morphosyntax could reduce its tendency to misgender.
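A pronoun-consistency score of the kind described can be sketched as the fraction of third-person pronouns in a generation that belong to the referent's declared pronoun set. This is a hypothetical implementation for illustration; the pronoun inventories and the paper's exact metric definitions may differ.

```python
import re

# Illustrative pronoun sets keyed by the declared subject form
# (assumed inventories, not the paper's exact lists).
PRONOUN_SETS = {
    "she": {"she", "her", "hers", "herself"},
    "he": {"he", "him", "his", "himself"},
    "they": {"they", "them", "their", "theirs", "themself"},
    "xe": {"xe", "xem", "xyr", "xyrs", "xemself"},
}
ALL_PRONOUNS = set().union(*PRONOUN_SETS.values())

def pronoun_consistency(text: str, declared: str) -> float:
    """Fraction of pronouns in `text` matching the `declared` pronoun set."""
    tokens = re.findall(r"[a-z']+", text.lower())
    used = [t for t in tokens if t in ALL_PRONOUNS]
    if not used:
        return 1.0  # no pronouns generated: vacuously consistent
    correct = sum(t in PRONOUN_SETS[declared] for t in used)
    return correct / len(used)

print(pronoun_consistency("Xe said that xem liked the talk", "xe"))  # 1.0
print(pronoun_consistency("He said that his talk went well", "xe"))  # 0.0
```

A score of 0.0 on the second example corresponds to a fully misgendered generation, the failure mode the paper's metrics are built to detect.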
To address this issue, the authors propose two techniques: 1) Pronoun Tokenization Parity (PTP), which enforces consistent tokenization across gendered pronouns, and 2) leveraging the LLM's pre-existing pronoun knowledge by finetuning only its lexical (embedding) layer. Experiments across model sizes show that these methods substantially outperform standard finetuning, raising neopronoun accuracy from 14.1% to 58.4%. Notably, lexical finetuning with PTP improves pronoun consistency across all model sizes, with smaller models seeing the largest gains.
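The two ideas can be approximated on a toy model, assuming PyTorch: give each neopronoun form its own embedding row so it is a single token like "he"/"she" (a PTP-style fix), then freeze everything except the embeddings for lexical finetuning. This is a sketch of the general mechanism, not the paper's actual implementation; `TinyLM` and `add_tokens` are invented for illustration.

```python
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """Stand-in language model: embedding -> body -> vocabulary head."""
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.body = nn.Linear(dim, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, ids):
        return self.head(self.body(self.embed(ids)))

def add_tokens(model, n_new):
    """PTP-style parity: grow the vocabulary by n_new single-token
    neopronoun forms, initializing new rows from the mean embedding.
    (Resizing the output head is omitted in this sketch.)"""
    old = model.embed.weight.data
    new_embed = nn.Embedding(old.size(0) + n_new, old.size(1))
    new_embed.weight.data[: old.size(0)] = old
    new_embed.weight.data[old.size(0):] = old.mean(dim=0)
    model.embed = new_embed
    return model

model = TinyLM(vocab_size=100)
model = add_tokens(model, n_new=5)  # e.g. xe, xem, xyr, xyrs, xemself

# Lexical finetuning: only the embedding table receives gradients,
# so the model's frozen body reuses its existing pronoun knowledge.
for p in model.parameters():
    p.requires_grad = False
model.embed.weight.requires_grad = True
```

Freezing the body while training only the new embedding rows lets the model map the new single tokens into the region of embedding space it already handles grammatically, which is the intuition behind pairing PTP with lexical finetuning.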
The paper highlights that the observed tokenization disparities, a consequence of data scarcity, are a key contributor to LLM misgendering of underrepresented pronouns. The proposed solutions provide a promising path forward for developing more inclusive and grammatically proficient language models.
Key insights distilled from arxiv.org, by Anaelia Oval..., 04-09-2024
https://arxiv.org/pdf/2312.11779.pdf