Wang, Y., Ren, R., Li, J., Zhao, W. X., Liu, J., & Wen, J. (2024). REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering. arXiv preprint arXiv:2402.17497.
This paper introduces REAR, a novel framework designed to address the challenge of irrelevant retrieved documents misleading Large Language Models (LLMs) in open-domain question answering tasks. The authors aim to enhance the self-awareness of LLMs regarding the reliability of retrieved documents, enabling them to effectively utilize external knowledge for accurate answer generation.
REAR incorporates a relevance-aware architecture that includes an assessment module within the LLM to evaluate the relevance of retrieved documents. This module generates relevance scores, guiding the LLM to prioritize reliable external evidence during answer generation. The authors propose two training strategies: bi-granularity relevance fusion, which integrates coarse and fine-grained relevance supervision, and noise-resistant training, which enhances the model's ability to discern and handle irrelevant content.
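The scoring-then-generation flow described above can be illustrated with a minimal sketch. This is not the authors' code: `relevance_score` and `generate_answer` are hypothetical stand-ins for REAR's LLM-internal assessment module and conditioned decoder, using simple token overlap so the example stays self-contained.

```python
# Minimal sketch of relevance-guided answer selection in a REAR-style
# pipeline. The scorer and generator below are hypothetical stand-ins,
# not the paper's actual model components.

def relevance_score(query: str, doc: str) -> float:
    """Toy relevance score: token-overlap ratio between query and
    document, standing in for the LLM's internal assessment module."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def generate_answer(query: str, doc: str) -> str:
    """Placeholder for generation conditioned on a single document."""
    return f"answer derived from: {doc[:40]}"

def rear_style_answer(query: str, docs: list[str]) -> str:
    """Score every retrieved document, then generate conditioned on the
    highest-scoring one, so low-relevance documents cannot mislead
    the final answer."""
    best = max(docs, key=lambda d: relevance_score(query, d))
    return generate_answer(query, best)

docs = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "Bananas are rich in potassium.",
]
print(rear_style_answer("When was the Eiffel Tower completed?", docs))
```

In the actual framework, the relevance score is produced inside the LLM and guides generation rather than acting as a hard pre-generation filter; the sketch only conveys the intuition of prioritizing reliable external evidence.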
Experiments on four open-domain QA benchmarks (NQ, TriviaQA, WebQ, and SQuAD) demonstrate that REAR significantly outperforms existing RAG approaches, including RobustLM and Self-RAG. The ablation study highlights the contribution of each component, emphasizing the importance of the relevance assessment module, knowledge consistency verification, bi-granularity relevance fusion, and noise-resistant training.
REAR effectively enhances the self-awareness of LLMs in RAG systems, enabling them to accurately assess and utilize retrieved documents, leading to improved factual accuracy in open-domain question answering. The proposed framework demonstrates robustness to irrelevant content and variations in retriever capabilities.
This research contributes significantly to open-domain question answering by addressing a critical weakness of RAG systems: the susceptibility of LLMs to misleading information from irrelevant retrieved documents. REAR's architecture and training methods offer a promising path toward more reliable and accurate knowledge-intensive NLP systems.
While REAR demonstrates effectiveness at the document level, future research could explore finer-grained relevance assessment at the sentence or token level. Additionally, evaluating REAR's applicability across a wider range of RAG tasks, such as those in the KILT benchmark, would provide further insights into its generalizability.