Core Concepts
Multi-modal Knowledge Distillation with Prompt-Tuning enhances recommendation systems by bridging the semantic gap and reducing noise in multi-modal data.
Abstract
Multimedia platforms benefit from incorporating multi-modal content into recommender systems.
Challenges in multi-modal recommenders include overfitting and inaccuracies in side information.
PromptMM addresses these issues through Multi-modal Knowledge Distillation with prompt-tuning.
The framework compresses models, bridges the semantic gap, and adjusts for inaccuracies in multimedia data.
Experiments show the superiority of PromptMM over existing techniques.
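The distillation idea summarized above can be sketched as a standard soft-label knowledge-distillation loss, where a compact student model is trained to match the temperature-softened item-score distribution of a larger multi-modal teacher. This is a minimal illustrative sketch, not PromptMM's actual objective; the function name `kd_loss` and the toy scores are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """Soft-label distillation: KL(teacher || student) between
    temperature-softened score distributions over candidate items,
    scaled by T^2 as in standard knowledge distillation."""
    p = softmax(teacher_logits / temperature)
    q = softmax(student_logits / temperature)
    kl = np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean()
    return float(kl * temperature ** 2)

# Toy scores: large multi-modal teacher vs. compact student over 5 items
teacher = np.array([[4.0, 2.0, 1.0, 0.5, 0.1]])
student = np.array([[3.5, 2.2, 0.8, 0.4, 0.2]])
print(kd_loss(teacher, student))
```

A higher temperature flattens both distributions, so the student is pushed to reproduce the teacher's relative ranking of items rather than only its top choice.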
Stats
"The output dimension of SBERT and CNNs are 768 and 4,096, respectively."
"PromptMM outperforms state-of-the-art baselines."
"The time complexity of R(·) is O(∑_{m∈M} |I| d_m d)."
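The complexity quoted above corresponds to projecting each modality's raw features (dimension d_m, e.g. 768 for SBERT text and 4,096 for CNN visuals) of all |I| items into a shared d-dimensional space, at cost |I|·d_m·d per modality. The sketch below illustrates that projection; the variable names and random projection matrices are assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
num_items, d = 100, 64  # |I| items, shared embedding dimension d (toy values)
modal_dims = {"text": 768, "image": 4096}  # SBERT / CNN output dims from the paper

# Raw per-modality item features and one linear projection per modality.
# Each matmul costs |I| * d_m * d, giving O(sum over m of |I| d_m d) overall.
feats = {m: rng.standard_normal((num_items, dm)) for m, dm in modal_dims.items()}
proj = {m: rng.standard_normal((dm, d)) / np.sqrt(dm) for m, dm in modal_dims.items()}
embedded = {m: feats[m] @ proj[m] for m in modal_dims}

for m, e in embedded.items():
    print(m, e.shape)
```

After projection, every modality contributes an |I| × d matrix, so the modalities can be compared and fused in the same space regardless of their original dimensionality.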
Quotes
"Multimedia online platforms have greatly benefited from the incorporation of multimedia content into their personal recommender systems."
"PromptMM conducts model compression through distilling u-i edge relationship and multi-modal node content."
"Experiments on real-world data demonstrate PromptMM’s superiority over existing techniques."