Core Idea
AlignRec addresses the misalignment issue in multimodal recommendations by integrating three alignment objectives into a single framework to improve performance.
Abstract
This summary covers the challenges in multimodal recommendations, introduces the AlignRec framework, and outlines its architecture design, alignment objectives, training strategies, and intermediate evaluation protocols. It also presents experimental results that compare AlignRec with baselines and analyze the effectiveness of the generated multimodal features.
1. Introduction
- Multimodal recommendations are crucial in modern applications.
- Existing methods face challenges due to misalignment issues.
- AlignRec offers a solution by integrating three alignments into its framework.
2. Alignment Objectives
- Inter-Content Alignment: Unifying vision and text modalities.
- Content-Category Alignment: Bridging the gap between multimodal content features and ID-based features.
- User-Item Alignment: Aligning representations of users and items.
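
As an illustration of what an alignment objective can look like, here is a minimal sketch of a contrastive (InfoNCE-style) loss that pulls each item's content feature toward its own ID embedding and away from the other items' embeddings. This summary does not state which loss AlignRec uses for each alignment, so the choice of InfoNCE, the names `cosine` and `info_nce`, the temperature value, and the toy vectors are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i] and no other row."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)

# Toy data (hypothetical): content features and ID embeddings for three items.
content = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
id_aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
# Same ID embeddings, but paired with the wrong items.
id_shuffled = [id_aligned[1], id_aligned[2], id_aligned[0]]

loss_aligned = info_nce(content, id_aligned)
loss_shuffled = info_nce(content, id_shuffled)
print(f"aligned: {loss_aligned:.4f}, shuffled: {loss_shuffled:.4f}")
```

The loss is lower when each content feature is paired with its own ID embedding than when the pairing is shuffled, which is the behavior an alignment objective is meant to reward.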
3. Training Strategies
- Pre-training on inter-content alignment followed by joint training on remaining tasks.
- Decoupling the training process for better optimization.
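
The two-stage schedule above can be sketched as a mapping from training stage to the loss terms that are active in that stage. This is only a schematic under stated assumptions: the stage names, the loss names, the `stage_loss` helper, and the weights are hypothetical, and the summary does not say which additional terms (if any) are part of the joint stage.

```python
# Hypothetical stage and loss names; the source only states the ordering:
# pre-train on inter-content alignment, then train the remaining tasks jointly.
ACTIVE_LOSSES = {
    "pretrain": ("inter_content",),
    "joint": ("content_category", "user_item"),
}

def stage_loss(stage, loss_values, weights=None):
    """Weighted sum of the loss terms that are active in the given stage."""
    weights = weights or {}
    return sum(
        weights.get(name, 1.0) * loss_values[name]
        for name in ACTIVE_LOSSES[stage]
    )

# Toy per-batch loss values (illustrative numbers only).
losses = {"inter_content": 0.8, "content_category": 0.5, "user_item": 0.25}

pretrain_loss = stage_loss("pretrain", losses)
joint_loss = stage_loss("joint", losses, weights={"user_item": 2.0})
print(pretrain_loss, joint_loss)
```

Keeping the stages separate means the inter-content term never competes with the other objectives for gradient signal, which is one plausible reading of "decoupling" here.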
4. Intermediate Evaluation Protocols
- Zero-Shot Recommendation: Evaluating user interests based on historical interactions.
- Item-CF Recommendation: Assessing recommendations made using only multimodal features.
- Mask Modality Recommendation: Testing robustness in missing modality scenarios.
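
The three protocols above can be sketched on toy data as follows. This is a simplified illustration, not the paper's exact procedure: mean-pooling the history into a user profile, cosine similarity as the scoring function, zero-filling a masked modality, and every function and variable name below are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def zero_shot_rank(history, candidates, features):
    """Rank candidates by similarity to the mean feature of the user's history."""
    dim = len(features[history[0]])
    profile = [sum(features[i][d] for i in history) / len(history) for d in range(dim)]
    return sorted(candidates, key=lambda c: cosine(profile, features[c]), reverse=True)

def item_cf_neighbors(item, features, k=2):
    """Return the k items most similar to `item`, using multimodal features only."""
    others = [i for i in features if i != item]
    others.sort(key=lambda i: cosine(features[item], features[i]), reverse=True)
    return others[:k]

def mask_modality(vision, text, masked_items):
    """Concatenate vision and text features, zeroing vision for masked items."""
    return {
        i: ([0.0] * len(vision[i]) if i in masked_items else list(vision[i])) + list(text[i])
        for i in vision
    }

# Toy item features (hypothetical): items a and b are similar, c and d are similar.
vision = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
text = {"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0], "d": [0.0, 1.0]}

features = mask_modality(vision, text, masked_items=set())
ranking = zero_shot_rank(["a"], ["b", "c", "d"], features)
neighbors = item_cf_neighbors("c", features, k=1)

# Robustness check: drop the vision modality for item b and rank again.
masked = mask_modality(vision, text, masked_items={"b"})
masked_ranking = zero_shot_rank(["a"], ["b", "c", "d"], masked)
print(ranking, neighbors, masked_ranking)
```

In this toy setup, item b stays at the top of the ranking even after its vision features are masked, because its text features still resemble the user's history.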
5. Experimental Results
- AlignRec outperforms baselines in top-K recommendation metrics across datasets.
- Generated multimodal features show effectiveness in zero-shot and item-CF evaluations.
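
For readers unfamiliar with top-K metrics, here is a minimal sketch of two widely used ones, Recall@K and NDCG@K with binary relevance. The summary does not name the specific metrics reported, so treating these two as representative is an assumption, and the item IDs below are made up.

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k of the ranking."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg

# Toy example (hypothetical item IDs): one of three relevant items is in the top 3.
ranked = ["i3", "i7", "i1", "i9", "i4"]
relevant = {"i1", "i4", "i8"}

recall = recall_at_k(ranked, relevant, k=3)
ndcg = ndcg_at_k(ranked, relevant, k=3)
print(f"Recall@3 = {recall:.4f}, NDCG@3 = {ndcg:.4f}")
```

Recall@K only counts whether relevant items appear in the top K, while NDCG@K also rewards placing them nearer the top.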
Statistics
In this paper, we first systematically investigate the misalignment issue in multimodal recommendations, and propose a solution named AlignRec.
Our extensive experiments on three real-world datasets consistently verify the superiority of AlignRec compared to nine baselines.