The paper proposes IterCQR, a methodology for iteratively training a conversational query reformulation (CQR) model without relying on human-annotated rewrites. The key insights are:
IterCQR initializes the CQR model on queries rewritten by a large language model (LLM), then iteratively trains it by generating candidate queries and updating the model with retrieval signals as the reward.
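The outer loop can be sketched roughly as below. This is a minimal illustration, assuming the model has already been warm-started on LLM rewrites; `sample_rewrites`, `update`, and `retriever.embed` are hypothetical interface names introduced only for this sketch, not APIs from the paper's released code.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def itercqr_loop(model, retriever, train_data, num_iters=5, k=8):
    # model.sample_rewrites / model.update / retriever.embed are
    # hypothetical placeholders standing in for the paper's components.
    for _ in range(num_iters):
        for dialogue, gold_passage in train_data:
            # Sample K candidate query rewrites for the current turn.
            candidates = model.sample_rewrites(dialogue, k=k)
            # Reward each candidate by its cosine similarity to the gold
            # passage in the retriever's embedding space.
            p = retriever.embed(gold_passage)
            rewards = [cosine(retriever.embed(c), p) for c in candidates]
            # Update the model from the scored candidates (MBR and top-1
            # objectives; see the loss sketch after the next paragraph).
            model.update(dialogue, candidates, rewards)
    return model
```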
The iterative training process consists of two steps: exploration using Minimum Bayes Risk (MBR) training, and exploitation using top-1 candidate selection. The MBR training uses the cosine similarity between the retriever embeddings of candidate queries and gold passages as the reward, guiding the model toward retriever-friendly queries.
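The two objectives can be written compactly as follows. This is a minimal PyTorch sketch of reward-weighted (MBR-style) and top-1 training, assuming the candidate sequence log-likelihoods and retriever embeddings are already computed; the softmax normalization of rewards is an illustrative choice, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def mbr_loss(log_probs: torch.Tensor, cand_embs: torch.Tensor,
             passage_emb: torch.Tensor) -> torch.Tensor:
    # log_probs:   (K,) sequence log-likelihoods of K candidate queries
    # cand_embs:   (K, d) retriever embeddings of the candidates
    # passage_emb: (d,) retriever embedding of the gold passage
    rewards = F.cosine_similarity(cand_embs, passage_emb.unsqueeze(0), dim=-1)
    # Normalize rewards over the candidate set (softmax used here for
    # illustration) so they form weights summing to 1.
    weights = torch.softmax(rewards, dim=0)
    # Exploration: minimize reward-weighted negative log-likelihood.
    return -(weights * log_probs).sum()

def top1_loss(log_probs: torch.Tensor, cand_embs: torch.Tensor,
              passage_emb: torch.Tensor) -> torch.Tensor:
    rewards = F.cosine_similarity(cand_embs, passage_emb.unsqueeze(0), dim=-1)
    # Exploitation: train only on the highest-reward candidate.
    return -log_probs[rewards.argmax()]
```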
IterCQR achieves state-of-the-art performance on two widely used conversational search datasets, TopiOCQA and QReCC, outperforming strong baselines that rely on human-annotated rewrites.
The paper also demonstrates IterCQR's superior performance in challenging scenarios, such as generalization to unseen datasets and low-resource settings, without requiring additional human annotations.
Through quantitative analysis, the authors show that as the iterations progress, IterCQR generates queries that increasingly summarize the previous dialogue context, leading to improved retrieval performance.
Key insights distilled from the source by Yunah Jang, K... at arxiv.org, 04-09-2024: https://arxiv.org/pdf/2311.09820.pdf