TRAQ provides the first end-to-end statistical correctness guarantee for retrieval-augmented question answering by leveraging conformal prediction: it constructs prediction sets that contain a correct answer with a user-specified probability.
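As a minimal illustration of the split conformal prediction machinery such a guarantee rests on (this is a generic sketch, not TRAQ's actual pipeline; the nonconformity score and toy calibration data are assumptions):

```python
import math
import random

def conformal_threshold(cal_scores, alpha=0.1):
    """(1 - alpha) quantile of calibration scores with the (n + 1) correction."""
    n = len(cal_scores)
    rank = min(math.ceil((n + 1) * (1 - alpha)), n)  # 1-indexed order statistic
    return sorted(cal_scores)[rank - 1]

def prediction_set(candidate_scores, tau):
    """Keep every candidate answer whose nonconformity score is <= tau."""
    return [i for i, s in enumerate(candidate_scores) if s <= tau]

random.seed(0)
# Toy calibration set: nonconformity = 1 - model confidence in the true answer.
cal_scores = [1.0 - random.betavariate(8, 2) for _ in range(500)]
tau = conformal_threshold(cal_scores, alpha=0.1)

# At test time, every candidate scoring at or below tau is kept; under
# exchangeability the set contains the true answer with probability >= 90%.
candidates = [0.05, 0.40, 0.90]  # nonconformity of three candidate answers
kept = prediction_set(candidates, tau)
```

The guarantee is marginal over calibration and test draws; shrinking alpha widens the prediction sets.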
The core message of this paper is that incorporating multi-granularity evidence, i.e., both passage-level and sentence-level information, significantly improves the accuracy and efficiency of open-domain question answering systems.
A small encoder model can efficiently encode longer contexts, and applying a cross-attention mechanism over its representations improves the performance of open-domain question answering.
Retrieval-augmented open-domain question answering models face challenges in generalizing to updated knowledge corpora or unseen domains due to the reader's tendency to over-memorize retrieved contexts. Corpus-Invariant Tuning (CIT) is proposed to mitigate this issue by controlling the likelihood of retrieved documents during training, leading to improved generalization across different corpora and domains.
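One plausible form such likelihood control could take (a hypothetical sketch; the loss shape, `lam` weight, and helper name are assumptions, not CIT's published objective):

```python
def cit_style_loss(answer_nll, context_token_logprobs, lam=0.1):
    """Answer loss plus a penalty on the reader's likelihood of the
    retrieved-context tokens, discouraging corpus memorization."""
    # Mean log-probability the reader assigns to retrieved-context tokens.
    context_ll = sum(context_token_logprobs) / len(context_token_logprobs)
    # Higher context likelihood -> higher loss, so the reader is pushed to
    # rely on the context at inference time rather than memorize it.
    return answer_nll + lam * context_ll
```

Under this sketch, a reader that memorizes its contexts (high `context_ll`) pays a larger penalty than one that merely conditions on them.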