Key Concepts
FiD models overfit to the context quality seen during training, which degrades performance when context quality differs at test time.
Summary
Abstract:
Retrieval-augmented generation models use external knowledge during generation.
Context quality affects FiD model training and performance in open-domain QA tasks.
Introduction:
Large-scale language models perform strongly but struggle with hallucinations and with incorporating new information.
Retrieval-augmented models address both of these challenges.
Experimental Results:
FiD models overfit to context quality during training, leading to suboptimal performance in varied contexts.
Models show different cross-attention patterns based on training context quality.
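One way to make the "different cross-attention patterns" concrete is to compare the entropy of the decoder's attention mass over retrieved passages: a model trained on high-quality contexts tends to concentrate attention on few passages (low entropy), while broader attention yields higher entropy. The sketch below is illustrative only; the attention weights are hypothetical values, not measurements from the paper.

```python
import numpy as np

def attention_entropy(weights):
    """Shannon entropy of a cross-attention distribution over passages.

    Low entropy -> attention concentrated on a few passages;
    high entropy -> attention spread across many passages.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise to a probability distribution
    return float(-(w * np.log(w + 1e-12)).sum())

# Hypothetical cross-attention mass over 5 retrieved passages.
focused = [0.85, 0.05, 0.04, 0.03, 0.03]  # concentrated pattern
diffuse = [0.22, 0.20, 0.20, 0.19, 0.19]  # spread-out pattern

print(attention_entropy(focused))  # lower entropy
print(attention_entropy(diffuse))  # higher entropy, near ln(5)
```

The maximum possible entropy over five passages is ln(5) ≈ 1.609, so the diffuse pattern sits near that ceiling while the focused one sits well below it.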
Proposed Method:
Introducing bias to cross-attention distribution mitigates overfitting to specific context quality.
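A minimal way to sketch this idea is a temperature-style bias applied to cross-attention logits before the softmax, which flattens or sharpens the resulting distribution. This is a simplified illustration of biasing cross-attention, not the paper's exact formulation; the logit values are hypothetical.

```python
import numpy as np

def biased_softmax(logits, temperature=1.0):
    """Softmax over cross-attention logits with a temperature bias.

    temperature > 1 flattens the distribution (discouraging over-reliance
    on a single passage); temperature < 1 sharpens it.
    """
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical cross-attention logits over 5 retrieved passages.
logits = np.array([4.0, 1.0, 0.5, 0.2, 0.1])
sharp = biased_softmax(logits, temperature=1.0)
flat = biased_softmax(logits, temperature=3.0)
# The biased (flattened) distribution spreads probability mass
# more evenly across passages.
```

In practice such a bias would be applied inside the decoder's cross-attention layers during training, so the model learns not to over-commit to the single best-looking passage.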
Conclusion:
Understanding how context characteristics impact model training is crucial for retrieval-augmented generation models.
Statistics
Experimental results show that FiD models overfit to context quality during training, which affects performance under different context qualities.
Models exhibit different cross-attention patterns depending on the context quality seen during training.