Core Concepts
This paper proposes a Query-bag Pseudo Relevance Feedback (QB-PRF) framework for improving information-seeking conversation systems. The framework has two modules: a Query-bag Selection (QBS) module, which retrieves and selects related queries to form a query-bag, and a Query-bag Fusion (QBF) module, which fuses the query-bag with the original query to produce a better query representation.
Abstract
This paper proposes a Query-bag Pseudo Relevance Feedback (QB-PRF) framework to enhance the performance of information-seeking conversation systems. The key components are:
Representation Learning: The authors use unsupervised pre-training, specifically a Variational Auto-Encoder (VAE), to obtain distinctive sentence representations that serve as supervision signals for the QBS module.
Query-bag Selection (QBS) Module: The QBS module uses contrastive learning, built on the representations from the pre-trained VAE, to select relevant queries from an unlabeled corpus and form the query-bag.
Query-bag Fusion (QBF) Module: The QBF module utilizes a transformer-based network to fuse the mutual information between the original query and the selected query-bag, generating a refined query representation.
Matching Model: The refined query representation from the QBF module is then fed into any downstream matching model to enhance its performance on information-seeking conversation tasks.
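The select-then-fuse flow described in the components above can be illustrated with a minimal sketch. It is not the authors' implementation: a cosine-similarity cutoff stands in for the contrastively trained QBS selector, a single attention-weighted average stands in for the transformer-based QBF network, and short hand-written vectors stand in for VAE sentence representations. The names cosine, select_query_bag, fuse, k, and min_sim are hypothetical and chosen only for this illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_query_bag(query_vec, corpus_vecs, k=2, min_sim=0.5):
    """Toy stand-in for QBS: indices of up to k corpus vectors closest to the query.

    Assumption: a fixed similarity cutoff replaces the trained contrastive selector.
    """
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(corpus_vecs)]
    kept = [(s, i) for s, i in scored if s >= min_sim]
    kept.sort(key=lambda pair: (-pair[0], pair[1]))  # highest similarity first
    return [i for _, i in kept[:k]]


def fuse(query_vec, bag_vecs):
    """Toy stand-in for QBF: the query attends over itself and its query-bag.

    Assumption: one scaled dot-product attention step replaces the transformer network.
    """
    items = [query_vec] + list(bag_vecs)
    dim = len(query_vec)
    logits = [sum(a * b for a, b in zip(query_vec, v)) / math.sqrt(dim) for v in items]
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * v[d] for w, v in zip(weights, items)) for d in range(dim)]


# Made-up 2-dimensional "sentence representations" for illustration only.
query = [1.0, 0.0]
corpus = [[0.9, 0.1], [0.0, 1.0], [0.8, 0.6], [-1.0, 0.0]]

bag_indices = select_query_bag(query, corpus)        # [0, 2]
refined = fuse(query, [corpus[i] for i in bag_indices])
```

In this sketch the refined vector is a convex combination of the original query and its query-bag, so it stays close to the query while absorbing signal from related queries. It could then be passed to a downstream matching model in place of the original query representation.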
The authors verify the effectiveness of the QB-PRF framework on two competitive pre-trained backbone models, BERT and GPT-2, across two benchmark datasets (Quora and LaiYe). The experimental results demonstrate that the proposed framework significantly outperforms strong baselines, highlighting the importance of leveraging query-bag information to improve query representation and overall system performance.
Stats
The authors report the following key statistics:
The LaiYe dataset has 425,310 training, 40,000 validation, and 40,000 test queries, with an average of 11.20 queries per query-bag.
The Quora dataset has 56,294 training, 5,536 validation, and 10,000 test queries, with an average of 8.43 queries per query-bag.