This paper examines the extent to which established findings on ranked list truncation (RLT) for retrieval generalize to the "retrieve-then-re-rank" setup, where the goal is to optimize the trade-off between effectiveness and efficiency in re-ranking.
The key insights are:
Supervised RLT methods show no clear advantage over using a fixed re-ranking depth: well-chosen fixed depths closely approximate the effectiveness/efficiency trade-offs that supervised RLT methods achieve.
The choice of retriever has a substantial impact on RLT for re-ranking: with an effective retriever such as SPLADE++ or RepLLaMA, a fixed re-ranking depth of 20 can already yield an excellent effectiveness/efficiency trade-off (see the pipeline sketch after this summary).
Supervised RLT methods tend to fail at predicting when re-ranking should be skipped entirely, and appear to suffer from a lack of training data.
The authors reproduce 8 RLT methods and conduct extensive experiments on 2 datasets, using pipelines that combine 3 retrievers and 2 re-rankers. The findings give a comprehensive picture of how RLT methods generalize to the "retrieve-then-re-rank" perspective.
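For concreteness, below is a minimal Python sketch of the fixed-depth baseline in a retrieve-then-re-rank pipeline. The `retrieve` and `rerank` callables are hypothetical stand-ins for a first-stage retriever (e.g., SPLADE++ or RepLLaMA) and a neural re-ranker; the parameter names are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (assumed interfaces, not the paper's code): retrieve a
# candidate list, truncate it at a fixed depth, re-rank only the truncated
# prefix, and keep the first-stage order for the rest.

from typing import Callable, List, Tuple

Doc = str
Scored = Tuple[Doc, float]


def retrieve_then_rerank(
    query: str,
    retrieve: Callable[[str, int], List[Scored]],      # first-stage retriever (hypothetical)
    rerank: Callable[[str, List[Doc]], List[Scored]],  # expensive re-ranker (hypothetical)
    candidate_pool: int = 1000,   # depth of the initial ranked list
    rerank_depth: int = 20,       # fixed truncation point instead of a learned RLT model
) -> List[Scored]:
    """Fixed-depth baseline for the retrieve-then-re-rank setup."""
    candidates = retrieve(query, candidate_pool)

    # Fixed-depth truncation: only the top `rerank_depth` documents are passed
    # to the costly re-ranker; the tail keeps its first-stage order and scores.
    head = [doc for doc, _ in candidates[:rerank_depth]]
    tail = candidates[rerank_depth:]

    reranked_head = rerank(query, head)
    return reranked_head + tail
```

A supervised RLT method would replace the constant `rerank_depth` with a per-query prediction of where to cut the list; the paper's finding is that this added machinery often does not beat a well-chosen fixed depth.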
Key insights distilled from the source by Chuan Meng, N... at arxiv.org, 04-30-2024: https://arxiv.org/pdf/2404.18185.pdf