The authors introduce the task of pronoun use fidelity, which evaluates whether language models can correctly reuse a previously specified pronoun in a later sentence, independent of potential distractors. They present a carefully designed dataset of over 5 million instances to evaluate this task in English across 37 popular large language models.
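To make the task setup concrete, here is a minimal sketch of how such an evaluation instance might be constructed: a context sentence introduces a pronoun for a referent, optional distractor sentences mention a different pronoun, and the model is asked to continue with the correct pronoun. The names, templates, and helper function are illustrative assumptions, not the authors' actual data or code.

```python
# Hypothetical sketch of a pronoun-use-fidelity instance.
# Templates and names are illustrative, not from the authors' dataset.

def build_instance(name, pronoun_set, distractor_pronouns):
    """Return (prompt, expected_pronoun) for a fidelity check.

    The context introduces `name` with a pronoun; each distractor
    sentence mentions a different pronoun; the query asks the model
    to fill the blank with the originally introduced pronoun.
    """
    nominative, accusative, possessive = pronoun_set  # e.g. ("she", "her", "her")
    context = f"{name} is a doctor. {nominative.capitalize()} joined the clinic last year."
    distractor_text = " ".join(
        f"A colleague said {d} would arrive late." for d in distractor_pronouns
    )
    query = "After the shift, ___ went home."  # faithful answer: `nominative`
    prompt = " ".join(part for part in [context, distractor_text, query] if part)
    return prompt, nominative

# One distractor sentence with a competing pronoun:
prompt, expected = build_instance("Avery", ("they", "them", "their"), ["he"])
print(expected)  # the faithful completion reuses the introduced pronoun
```

Accuracy on the task is then simply the fraction of instances where the model's completion matches the introduced pronoun, computed separately for each pronoun set and each number of distractor sentences.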
The key findings are:
While models can mostly faithfully reuse previously specified pronouns in the absence of distractors, their accuracy is significantly lower for she/her/her pronouns, singular they, and neopronouns.
Models are not robustly faithful to pronouns, as they are easily distracted. With even one additional sentence containing a distractor pronoun, accuracy drops on average by 34%. With 5 distractor sentences, accuracy drops by 52% for decoder-only models and 13% for encoder-only models.
Encoder-only models are better at faithfully reusing pronouns than decoder-only models of the same scale, and they are also more robust to the addition of distractor sentences. Most model errors are due to distraction, but with additional distractors, encoder-only models become less distracted while decoder-only models get even more distracted.
The authors conclude that widely-used large language models are unable to robustly and faithfully reason about pronouns in a simple setting that is easy for humans, and they encourage researchers in bias and reasoning to address these performance gaps.
Key insights distilled from: Vagrant Gaut... on arxiv.org, 04-05-2024
https://arxiv.org/pdf/2404.03134.pdf