The paper explores different strategies for integrating retrieved passages with large language models (LLMs) to enhance open-domain question answering. The authors first examine the limitations of a commonly used concatenation approach, which often results in "unknown" outputs even when the correct document is among the top-k retrieved passages.
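The concatenation baseline can be sketched roughly as follows. The prompt template and function name here are illustrative assumptions, not the paper's exact format:

```python
# Minimal sketch of the concatenation approach (hypothetical prompt
# template): join the top-k retrieved passages into a single context
# block and append the question.
def build_concat_prompt(question: str, passages: list[str]) -> str:
    """Concatenate top-k passages, then ask the question."""
    context = "\n\n".join(
        f"Passage {i + 1}: {p}" for i, p in enumerate(passages)
    )
    return (
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer (reply 'unknown' if the passages do not contain it):"
    )

if __name__ == "__main__":
    print(build_concat_prompt(
        "Who wrote Hamlet?",
        ["Hamlet is a tragedy by William Shakespeare.",
         "It was written around 1600."],
    ))
```

The weakness the authors highlight is visible in this setup: even when the gold passage is somewhere in the concatenated context, the model may still answer "unknown".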
To address this issue, the authors explore four alternative strategies:
- Two single-round methods that utilize chain-of-thought reasoning.
- Two multi-round approaches that incorporate feedback loops.
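As a rough illustration of the multi-round pattern, here is a hedged sketch assuming a hypothetical `ask_llm` callable and a fall-back-to-single-passage retry policy; neither is the paper's exact procedure:

```python
# Generic multi-round feedback loop (an assumed retry policy, not the
# paper's specific algorithm): query with the full concatenation first,
# and if the model answers "unknown", re-ask with one passage per round.
def multi_round_answer(question, passages, ask_llm, max_rounds=3):
    """Return the first non-'unknown' answer, falling back to
    per-passage rounds after an initial full-context attempt."""
    answer = ask_llm(question, passages)
    if answer.lower() != "unknown":
        return answer
    for passage in passages[:max_rounds]:
        answer = ask_llm(question, [passage])
        if answer.lower() != "unknown":
            return answer
    return "unknown"
```

The design intuition is that a shorter, more focused context in later rounds gives the model a second chance to surface an answer it missed in the concatenated prompt.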
Through comprehensive experiments on three open-domain question-answering datasets (NQ, TriviaQA, and SQuAD), the authors demonstrate that their multi-round approaches outperform the traditional concatenation approach, achieving over a 10% improvement in answer exact match (EM).
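Answer exact match (EM) is conventionally computed with SQuAD-style answer normalization; a standard implementation is shown below for reference, though the paper's normalization may differ in detail:

```python
import re
import string

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, golds: list[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    return any(normalize(prediction) == normalize(g) for g in golds)
```

Under this metric, "The Eiffel Tower!" counts as a match for the gold answer "eiffel tower", so the reported gains reflect genuinely different answers rather than surface formatting.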
The paper also provides insights into the optimal placement of the gold passage within the top-k retrieved passages, the effect of different decoding strategies, and the token usage analysis of the proposed methods.
Key ideas extracted from https://arxiv.org/pdf/2308.12574.pdf by Ye Liu, Semih... (arxiv.org, 04-09-2024).