Central Concepts
The author highlights the gap between general-purpose text embeddings and the specific demands of item retrieval, and proposes a set of in-domain fine-tuning tasks to bridge it, showing significant improvements in retrieval performance.
Summary
This paper addresses the limitations of general-purpose text embeddings for item retrieval and proposes in-domain fine-tuning tasks as a remedy. Experimental results show substantial gains in retrieval performance across a range of tasks, underscoring the importance of tailored representations for effective item retrieval.
Statistics
The Hit@5 metric for E5 on the US2I task increased dramatically from 0.0424 to 0.4723 after fine-tuning.
The Xbox training set totals 120,000 examples, with 40,000 for UH2I, 20,000 for I2I, and roughly 7,000 for each of the remaining tasks.
The Steam training set totals 200,000 examples, with 80,000 for UH2I, 40,000 for I2I, and roughly 10,000 for each of the remaining tasks.
Coverage@K measures the proportion of the top-K retrieved items that satisfy the query conditions.
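For reference, the sketch below shows how the two reported metrics are commonly computed: Hit@K is 1 when at least one relevant item appears in the top-K results, and Coverage@K (as defined above) is the fraction of the top-K items satisfying the query conditions. The function names and example item IDs are illustrative, not taken from the paper.

```python
def hit_at_k(ranked_items, relevant_items, k=5):
    """Hit@K: 1 if at least one relevant item appears in the top-K results, else 0."""
    return int(any(item in relevant_items for item in ranked_items[:k]))

def coverage_at_k(ranked_items, satisfies_query, k=5):
    """Coverage@K: fraction of the top-K retrieved items that satisfy the query conditions."""
    top_k = ranked_items[:k]
    return sum(satisfies_query(item) for item in top_k) / len(top_k)

# Illustrative usage with made-up item IDs.
ranked = ["item_12", "item_7", "item_3", "item_42", "item_9"]
relevant = {"item_3"}
print(hit_at_k(ranked, relevant, k=5))                      # 1
print(coverage_at_k(ranked, lambda i: i in relevant, k=5))  # 0.2
```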
Quotes
"In-domain fine-tuning is essential for enhancing item retrieval performance."
"Models exhibit poor OOD performance on tasks closely related to user behaviors."
"The refined model acts as a robust and versatile backbone for various item retrieval tasks."