The authors introduce the task of pronoun fidelity, which evaluates whether language models can correctly reuse a previously specified pronoun in a later sentence, independent of potential distractors. They present a carefully designed dataset of over 5 million instances to evaluate this task in English across 37 popular large language models.
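To make the task concrete, here is a minimal sketch of how a pronoun-fidelity instance might be constructed. The templates, referent, pronoun paradigms, and cloze-style query below are illustrative assumptions, not the authors' actual dataset code; the real dataset is far larger and more carefully controlled.

```python
# Minimal sketch of a pronoun-fidelity instance: introduce a referent with a
# specified pronoun, optionally add sentences about someone else that use a
# distractor pronoun, then ask the model to reuse the original pronoun.
# All templates and word choices here are illustrative assumptions.

PRONOUNS = {
    "he":   {"nom": "he",   "acc": "him",  "poss": "his"},
    "she":  {"nom": "she",  "acc": "her",  "poss": "her"},
    "they": {"nom": "they", "acc": "them", "poss": "their"},
    "xe":   {"nom": "xe",   "acc": "xem",  "poss": "xyr"},  # one neopronoun paradigm
}

def build_instance(referent, pronoun, distractor_pronoun, n_distractors=0):
    """Return (text, expected_pronoun) for one evaluation instance."""
    p = PRONOUNS[pronoun]
    d = PRONOUNS[distractor_pronoun]
    context = f"The {referent} said that {p['nom']} would arrive early."
    distractors = " ".join(
        f"The colleague mentioned that {d['nom']} was running late."
        for _ in range(n_distractors)
    )
    # Cloze-style continuation: a faithful model fills the slot with the
    # pronoun already established for the referent.
    query = f"The {referent} confirmed that [MASK] had booked the room."
    text = " ".join(part for part in (context, distractors, query) if part)
    return text, p["nom"]

text, gold = build_instance("accountant", "they", "he", n_distractors=2)
print(text)  # context + two distractor sentences + masked query
print(gold)  # "they"
```

The [MASK] slot matches how an encoder-only model would be queried; for decoder-only models one would instead compare the likelihood the model assigns to each candidate pronoun at that position.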
The key findings are:
While models can mostly faithfully reuse previously specified pronouns in the absence of distractors, they are significantly worse at processing she/her/her, singular they, and neopronouns than he/him/his.
Models are not robustly faithful to pronouns, as they are easily distracted: with even one additional sentence containing a distractor pronoun, accuracy drops by an average of 34%. With 5 distractor sentences, accuracy drops by 52% for decoder-only models and by 13% for encoder-only models (a sketch of this measurement follows the findings).
Encoder-only models are better at faithfully reusing pronouns than decoder-only models of the same scale, and they are also more robust to the addition of distractor sentences. Most model errors are due to distraction, but with additional distractors, encoder-only models become less distracted while decoder-only models get even more distracted.
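To show how the distractor effect above could be quantified, here is a hypothetical measurement loop. It reuses build_instance from the earlier sketch, and predict_pronoun is an assumed stand-in for model-specific scoring (mask filling for encoder-only models, candidate ranking for decoder-only models); this is not the authors' evaluation code.

```python
def accuracy_by_distractors(instances, predict_pronoun, max_distractors=5):
    """Measure pronoun-reuse accuracy as distractor sentences are added.

    instances: list of (referent, pronoun, distractor_pronoun) triples.
    predict_pronoun: assumed callable mapping an instance text to the
        model's chosen pronoun for the masked slot.
    Returns {distractor_count: accuracy}, so per-count drops like those
    reported above can be read off directly.
    """
    results = {}
    for n in range(max_distractors + 1):
        correct = 0
        for referent, pronoun, distractor in instances:
            text, gold = build_instance(referent, pronoun, distractor, n)
            correct += int(predict_pronoun(text) == gold)
        results[n] = correct / len(instances)
    return results
```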
The authors conclude that widely used large language models are unable to robustly and faithfully reason about pronouns in a simple setting that is easy for humans, and they encourage researchers in bias and reasoning to address these performance gaps.
Source: Vagrant Gautam et al., arxiv.org, 04-05-2024, https://arxiv.org/pdf/2404.03134.pdf