Key Concepts
Retrieval-augmented encoder-decoder language models can significantly improve in-context learning performance through a combination of retrieval-augmented masked language modeling, retrieval-augmented prefix language modeling, and Fusion-in-Context Learning.
Summary
The paper investigates the in-context learning ability of retrieval-augmented encoder-decoder language models, which combine a retriever and an encoder-decoder reader. The authors first conduct a comprehensive analysis of existing models and identify their limitations, such as a mismatch between pretraining and inference, as well as a restricted context length.
To address these issues, the authors propose RAVEN, a model that combines retrieval-augmented masked language modeling and retrieval-augmented prefix language modeling. They further introduce Fusion-in-Context Learning, which enables the model to leverage more in-context examples without requiring additional training. The authors also utilize the retriever of RAVEN to retrieve relevant in-context examples, further enhancing the few-shot performance.
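The idea behind Fusion-in-Context Learning can be sketched as follows. In a Fusion-in-Decoder-style reader, each retrieved passage is encoded separately and the decoder attends over all encodings jointly; rather than prepending every in-context example to every passage (which quickly exhausts the encoder's context window), the examples can be split into small groups, with each group paired with each passage. This is a minimal illustration, not the paper's exact implementation: the function name, prompt template, and group size are assumptions.

```python
def build_ficl_inputs(question, passages, examples, group_size=2):
    """Construct encoder inputs in a Fusion-in-Context-Learning style.

    The in-context examples are split into groups of `group_size`, and
    each group is prepended to each retrieved passage. The decoder then
    fuses all encodings, so the model effectively sees every example
    while no single encoder input holds more than `group_size` of them.
    """
    # Split the demonstrations into small groups.
    groups = [examples[i:i + group_size]
              for i in range(0, len(examples), group_size)]
    encoder_inputs = []
    for group in groups:
        demo_text = " ".join(f"Q: {q} A: {a}" for q, a in group)
        for passage in passages:
            # One encoder input = a few demonstrations + the target
            # question + a single retrieved passage.
            encoder_inputs.append(
                f"{demo_text} Q: {question} Passage: {passage}"
            )
    return encoder_inputs

inputs = build_ficl_inputs(
    "Who wrote Hamlet?",
    passages=["p1", "p2", "p3"],
    examples=[("q1", "a1"), ("q2", "a2"), ("q3", "a3"), ("q4", "a4")],
    group_size=2,
)
# 2 example groups x 3 passages = 6 encoder inputs,
# each containing at most 2 demonstrations
```

Because the number of encoder inputs grows with the number of example groups rather than with a single concatenated prompt, more demonstrations can be used at inference time without any additional training.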
Through extensive experiments on open-domain question answering and language understanding tasks, the authors demonstrate that RAVEN significantly outperforms previous retrieval-augmented encoder-decoder models, achieving results comparable to the most advanced language models in certain scenarios, despite having substantially fewer parameters.
The key highlights and insights from the paper are:
- Retrieval-augmented encoder-decoder language models exhibit a certain in-context learning ability, but their performance is limited by a mismatch between pretraining and inference, as well as a restricted context length.
- RAVEN, the proposed model, combines retrieval-augmented masked language modeling and retrieval-augmented prefix language modeling to mitigate the pretraining-inference mismatch.
- Fusion-in-Context Learning enables RAVEN to effectively utilize more in-context examples during inference, without requiring additional training.
- Integrating the retriever of RAVEN to retrieve relevant in-context examples further enhances the few-shot performance.
- RAVEN significantly outperforms previous retrieval-augmented encoder-decoder models and achieves results comparable to the most advanced language models, despite having substantially fewer parameters.
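The last idea in the list above, reusing the model's retriever to select relevant in-context examples, amounts to scoring candidate training examples against the test question and keeping the top-k as demonstrations. The toy dot-product scorer below is a hedged sketch of that selection step, not the paper's dense retriever; the data layout and function name are assumptions.

```python
def retrieve_in_context_examples(question_vec, candidates, k=2):
    """Select the k candidate examples whose embeddings score highest
    against the question embedding (dot-product similarity), mirroring
    the idea of reusing the retriever to pick demonstrations.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # Rank candidates by similarity to the question, highest first.
    scored = sorted(candidates,
                    key=lambda c: dot(question_vec, c["vec"]),
                    reverse=True)
    return [c["example"] for c in scored[:k]]

# Toy usage with 2-dimensional dummy embeddings.
candidates = [
    {"vec": [1.0, 0.0], "example": "Q: q1 A: a1"},
    {"vec": [0.0, 1.0], "example": "Q: q2 A: a2"},
    {"vec": [0.9, 0.1], "example": "Q: q3 A: a3"},
]
demos = retrieve_in_context_examples([1.0, 0.0], candidates, k=2)
# → ["Q: q1 A: a1", "Q: q3 A: a3"]
```

The selected demonstrations would then be fed into the prompt (or distributed across encoder inputs, as in Fusion-in-Context Learning) before decoding the answer.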
Quotes
"In this paper, we investigate the in-context learning ability of retrieval-augmented encoder-decoder language models."
"To address these issues, we propose RAVEN, a model that combines retrieval-augmented masked language modeling and prefix language modeling."
"We further introduce Fusion-in-Context Learning to enhance the few-shot performance by enabling the model to leverage more in-context examples without requiring additional training."