This research paper introduces MORCELA (Magnitude-Optimized Regression for Controlling Effects on Linguistic Acceptability), a novel linking theory designed to enhance the correlation between language model (LM) probability scores and human judgments of sentence acceptability.
The authors argue that existing linking theories, such as the widely used SLOR (Syntactic Log-Odds Ratio), rely on fixed assumptions about how sentence length and word frequency affect LM probabilities: SLOR subtracts a sentence's unigram log probability and divides by its length, applying the same unit-weight correction to every model. These assumptions may not hold across different models, particularly larger, more performant ones.
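For concreteness, here is a minimal Python sketch of the SLOR computation (function and argument names are illustrative, not from the paper):

```python
def slor(lm_logprob: float, unigram_logprob: float, length: int) -> float:
    """Syntactic Log-Odds Ratio for a single sentence.

    SLOR(W) = (log p_LM(W) - log p_uni(W)) / |W|

    Both log probabilities are sentence-level sums over tokens: the
    unigram term corrects for word frequency, and dividing by the
    token count |W| corrects for sentence length, each with a fixed,
    model-independent weight.
    """
    return (lm_logprob - unigram_logprob) / length
```

Those fixed, model-independent weights on the two corrections are precisely the assumption the paper challenges.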
MORCELA addresses this limitation by incorporating learnable parameters that adjust for these factors on a per-model basis. The researchers evaluate MORCELA on two families of transformer LMs, Pythia and OPT, and find that it consistently outperforms SLOR at predicting human acceptability judgments, with the largest gains coming from the larger models.
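To illustrate what per-model learnable adjustments could look like, the sketch below estimates frequency and length coefficients by ordinary least squares against human ratings. This is an assumed parameterization for illustration only; the paper's exact MORCELA formula may differ.

```python
import numpy as np

def fit_adjustment_params(lm_logprobs, unigram_logprobs, lengths, human_ratings):
    """Fit per-model frequency and length adjustments by least squares.

    Regresses human acceptability ratings on the LM log probability,
    the unigram log probability, and the sentence length, so that the
    data (rather than fixed assumptions, as in SLOR) determine how
    strongly each factor is corrected for.  This is a hypothetical
    stand-in for MORCELA's learned parameters, not the paper's exact
    formulation.
    """
    X = np.column_stack([
        np.asarray(lm_logprobs),       # log p_LM(W): raw model score
        np.asarray(unigram_logprobs),  # log p_uni(W): frequency term (SLOR fixes its weight)
        np.asarray(lengths),           # |W|: length term (SLOR divides by it instead)
        np.ones(len(lengths)),         # intercept
    ])
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(human_ratings), rcond=None)
    return coeffs  # weights for [LM score, frequency, length, intercept]
```

Under this framing, SLOR amounts to pinning the frequency and length weights at the same fixed values for every model, whereas a MORCELA-style fit lets each LM's own data choose them.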
Furthermore, the study's analysis of the learned parameters suggests that SLOR overcorrects for length and frequency effects: the adjustments the data actually call for are smaller than the fixed corrections SLOR applies, and the gap grows with model size. This overcorrection underscores the importance of model-specific adjustments when comparing LM probabilities to human judgments.
The authors also explore the relationship between a model's ability to predict infrequent words in context and its sensitivity to unigram frequency. They find that larger models, which are better at predicting rare words from context, require less correction for word frequency, suggesting that stronger use of context confers robustness to frequency effects.
The paper concludes by emphasizing that model-specific characteristics must be taken into account when evaluating LM acceptability judgments, and that doing so yields a more accurate assessment of how well LMs align with human linguistic judgments.
Source: Lindia Tjuat... at arxiv.org, 11-06-2024, https://arxiv.org/pdf/2411.02528.pdf