Jacob, M., Lindgren, E., Zaharia, M., Carbin, M., Khattab, O., & Drozdov, A. (2024). Drowning in Documents: Consequences of Scaling Reranker Inference. arXiv preprint arXiv:2411.11767.
This paper tests the widely held assumption that more computationally expensive rerankers consistently improve the quality of retrieved documents in information retrieval (IR) systems, especially as the number of documents scored grows.
The authors evaluate the performance of various state-of-the-art open-source and proprietary rerankers on eight different academic and enterprise IR benchmarks. They measure the recall of these rerankers when tasked with scoring an increasing number of documents (k) retrieved by different first-stage retrieval methods. Additionally, they compare the performance of rerankers against standalone retrievers in a full-dataset retrieval scenario.
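The evaluation protocol described above (rerank the top k first-stage candidates, then measure recall as k grows) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `recall_at` and `rerank_sweep`, the evaluation cutoff, the document IDs, and the toy scores are all assumptions made up for this example, and the numbers it produces are not results from the paper.

```python
from typing import Callable, Dict, Sequence, Set


def recall_at(ranked_ids: Sequence[str], relevant: Set[str], cutoff: int) -> float:
    """Fraction of relevant documents that appear in the top `cutoff` positions."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:cutoff] if doc_id in relevant)
    return hits / len(relevant)


def rerank_sweep(
    first_stage_ranking: Sequence[str],
    score_fn: Callable[[str], float],
    relevant: Set[str],
    ks: Sequence[int],
    cutoff: int,
) -> Dict[int, float]:
    """For each k, rerank the top-k first-stage candidates and report recall@cutoff."""
    results: Dict[int, float] = {}
    for k in ks:
        candidates = list(first_stage_ranking[:k])
        # Pointwise reranking: each candidate is scored independently of the others.
        reranked = sorted(candidates, key=score_fn, reverse=True)
        results[k] = recall_at(reranked, relevant, cutoff)
    return results


# Hypothetical toy data: a first-stage ranking and made-up reranker scores.
# "d5" and "d7" stand in for irrelevant documents that the reranker overrates.
first_stage = ["d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7"]
relevant_docs = {"d1", "d4"}
toy_scores = {
    "d0": 0.20, "d1": 0.90, "d2": 0.10, "d3": 0.30,
    "d4": 0.80, "d5": 0.95, "d6": 0.10, "d7": 0.97,
}

sweep = rerank_sweep(first_stage, lambda d: toy_scores[d], relevant_docs, ks=[2, 5, 8], cutoff=2)
print(sweep)
```

In this toy setup, recall first rises as more candidates reach the reranker and then falls once the overrated irrelevant documents enter the candidate pool. A real evaluation would replace `score_fn` with an actual reranker and `first_stage` with the output of a first-stage retriever.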
The research challenges the prevailing understanding of reranker effectiveness in IR systems. The authors argue that current pointwise cross-encoder rerankers are not as robust as commonly believed, particularly when scoring a large number of documents. They suggest that limited exposure to negative examples during training and inherent limitations in deep learning robustness might contribute to the degradation seen as the number of scored documents grows.
This study highlights a critical gap in the current understanding and application of rerankers in IR systems. The findings have significant implications for practitioners who rely on rerankers to improve retrieval accuracy, urging them to carefully consider the trade-off between reranker complexity and the number of documents scored.
The study primarily focuses on pointwise cross-encoder rerankers and acknowledges the limitations posed by closed-source models. Future research could explore the impact of different training strategies, data distributions, and model sizes on reranker robustness. Further investigation into the potential of LLMs for listwise reranking and their application as teacher models for improving cross-encoder robustness is also warranted.