Core Concepts
REPLIQA, a new question-answering dataset built from synthetic documents describing fictional scenarios, offers a more reliable evaluation of the reading comprehension and information-retrieval abilities of large language models (LLMs) than existing benchmarks, which may be contaminated by training data.
Monteiro, J., Noël, P.-A., Marcotte, É., Rajeswar, S., Zantedeschi, V., Vázquez, D., Chapados, N., Pal, C., & Taslakian, P. (2024). REPLIQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content. arXiv preprint arXiv:2406.11811v2.
This paper introduces REPLIQA, a question-answering dataset designed to test whether LLMs can comprehend and retrieve information from reference documents they have never seen during training, thereby sidestepping the data-contamination problem that affects existing benchmarks.
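To make the intended evaluation setup concrete, here is a minimal sketch of a document-grounded QA evaluation loop over such a dataset. The Hugging Face repo id ("ServiceNow/repliqa"), split name, and column names used below are assumptions, not stated in the text above, and the lexical-overlap "model" is a toy stand-in for whatever LLM is actually being benchmarked.

```python
# Sketch only: dataset id, split, and column names are assumed, and
# toy_answer() is a placeholder baseline, not the paper's method.
from datasets import load_dataset


def toy_answer(document: str, question: str) -> str:
    """Toy stand-in for an LLM: return the document sentence that
    shares the most words with the question."""
    q_words = set(question.lower().split())
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    return max(
        sentences,
        key=lambda s: len(q_words & set(s.lower().split())),
        default="",
    )


def exact_match(pred: str, ref: str) -> bool:
    # Crude normalized string match; real evaluations typically add
    # token-level F1 or an LLM judge on top of this.
    return pred.strip().lower() == ref.strip().lower()


# Assumed repo id and split name; adapt to the actual release.
ds = load_dataset("ServiceNow/repliqa", split="repliqa_0")

hits = sum(
    exact_match(
        toy_answer(row["document_extracted"], row["question"]),
        row["answer"],
    )
    for row in ds
)
print(f"Exact match: {hits / len(ds):.3f}")
```

Because the reference documents are synthetic and fictional, a model scoring well in a loop like this must be reading the supplied document rather than recalling memorized training data, which is the core point of the benchmark.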