
Fusion-in-T5: A Unified Model for Effective Document Ranking by Integrating Multiple Signals

Core Concepts
Fusion-in-T5 (FiT5) is a unified document ranking model that integrates text matching, ranking features, and global document information through attention fusion, outperforming complex cascade ranking pipelines.
The paper introduces Fusion-in-T5 (FiT5), a document ranking model that consolidates multiple ranking signals into a single, unified framework. FiT5 is built on the T5 encoder-decoder architecture and employs an attention fusion mechanism to integrate text matching, ranking features, and global document information.

Key highlights:
- FiT5 packs the query, document, and ranking features into a templated input, allowing the model to jointly leverage these signals.
- FiT5 introduces global attention layers in the late stages of the encoder, enabling the model to consider information across the top-ranked documents when scoring each one.
- Experiments on the MS MARCO and TREC Deep Learning benchmarks show that FiT5 significantly outperforms traditional cascade ranking pipelines, including complex multi-stage systems.
- Analysis reveals that the attention fusion mechanism allows FiT5 to better differentiate between similar documents and produce more accurate rankings.
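The templated input could be constructed along the following lines. This is a minimal sketch: the template wording and the `pack_fit5_input` helper are illustrative assumptions, not the paper's actual template tokens.

```python
def pack_fit5_input(query, document, ranking_feature):
    # Pack the query, document, and a ranking feature (e.g. the
    # first-stage retrieval score) into one templated string that
    # the T5 encoder can consume. The template wording here is a
    # hypothetical illustration; the paper's actual template may differ.
    return f"Query: {query} Feature: {ranking_feature} Document: {document}"

packed = pack_fit5_input("what is bm25",
                         "BM25 is a bag-of-words ranking function.",
                         12.7)
```

Packing all signals into one sequence is what lets a single forward pass attend jointly over the query text, the document text, and the numeric feature, rather than handling each in a separate cascade stage.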
FiT5 outperforms the first-stage retrieval model coCondenser by a large margin on both MS MARCO and TREC DL. On MS MARCO, FiT5 further outperforms the multi-stage ranking pipeline Expando-Mono-Duo, which uses significantly larger models. Compared to monoT5, FiT5 exhibits a mere 4.5% increase in GPU memory usage and only a marginal increase in inference time, demonstrating its efficiency.
From the paper: "FiT5 exhibits substantial improvements over traditional re-ranking pipelines."

Deeper Inquiries

How can the attention fusion mechanism in FiT5 be further improved or extended to capture more nuanced relationships between documents?

The attention fusion mechanism in FiT5 could be enhanced by incorporating hierarchical attention. By introducing multiple levels of attention, FiT5 could capture relationships not only between individual documents but also between groups of documents, helping it judge each document's relevance within the result list as a whole and make more nuanced ranking decisions. Additionally, dynamic attention mechanisms that adapt to the query and document content could further improve the model's ability to capture subtle relationships between candidates.

What other types of ranking signals beyond text matching, features, and global information could be integrated into FiT5 to enhance its performance?

In addition to text matching, features, and global information, FiT5 could benefit from integrating user behavior signals such as click-through rates, dwell time, and bounce rates. By incorporating user interaction data, FiT5 could learn from user preferences and behavior patterns to improve document ranking. Sentiment signals derived from user feedback and reviews could provide further evidence about the relevance and quality of documents. Contextual signals, such as temporal or geographical relevance and domain-specific factors, could also be integrated to improve performance in specific search contexts.
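Since FiT5 consumes ranking features through a templated input, one plausible way to add behavior signals is to discretize them and append them as extra feature tokens. The template wording and the bucketing scheme below are hypothetical illustrations, not taken from the paper:

```python
def pack_with_behavior_signals(query, document, retrieval_score, ctr, dwell_time):
    # Extend the templated FiT5-style input with user-behavior features.
    # Bucketing keeps the vocabulary of feature tokens small, which is a
    # common trick for feeding continuous signals to a text model.
    ctr_bucket = min(int(ctr * 10), 9)            # CTR discretized into 10 buckets
    dwell_bucket = min(int(dwell_time // 10), 9)  # 10-second dwell-time buckets
    return (f"Query: {query} Score: {retrieval_score:.1f} "
            f"CTR: {ctr_bucket} Dwell: {dwell_bucket} Document: {document}")

templated = pack_with_behavior_signals(
    "what is bm25", "BM25 is a ranking function.", 12.7, 0.35, 25.0)
```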

Given the efficiency of FiT5, how could it be deployed in real-world search systems to provide fast and accurate document ranking at scale?

To deploy FiT5 in real-world search systems for fast and accurate document ranking at scale, several strategies can be combined:
- Distributed computing: use frameworks such as Apache Spark or TensorFlow's distributed runtime to parallelize computation and handle large query volumes efficiently.
- Batch processing: preprocess and re-rank documents in batches, improving resource utilization and reducing per-query overhead.
- Caching: store rankings for frequently issued queries and precomputed document representations, cutting computation time for common requests.
- Incremental learning: continuously update the model with new data and feedback so it stays current and relevant.
- Load balancing: distribute incoming search queries evenly across multiple FiT5 instances to maintain throughput and availability.
- Monitoring and optimization: track performance metrics and tune FiT5 parameters based on real-time feedback, driving continuous improvement in ranking accuracy and efficiency.
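As one concrete instance of the caching strategy above, a minimal in-process query cache can be built with Python's standard library. The `run_fit5` call below is a hypothetical stand-in for invoking the deployed model, not a real API:

```python
from functools import lru_cache

def run_fit5(query):
    # Hypothetical stand-in for a call to the deployed FiT5 re-ranker;
    # here it just returns a fixed ranked list of document ids.
    return ["doc-a", "doc-b", "doc-c"]

@lru_cache(maxsize=10_000)
def cached_rank(query):
    # Repeated queries are answered from an in-process LRU cache,
    # skipping the expensive model call. Results are returned as a
    # tuple so the cached value is immutable.
    return tuple(run_fit5(query))

first = cached_rank("what is bm25")
second = cached_rank("what is bm25")  # served from the cache
```

In production, an external cache (e.g. Redis) with explicit invalidation would typically replace `lru_cache`, since index updates can make cached rankings stale.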