Core Concepts
Disentangling the distinct effects of item ID co-occurrence patterns and fine-grained item modality preferences to improve the accuracy and interpretability of session-based recommendation.
Abstract
The paper proposes a novel framework called DIMO (Disentangling ID and MOdality effects) to disentangle the effects of item ID and modality in session-based recommendation.
At the item level, DIMO introduces a co-occurrence representation schema to explicitly incorporate co-occurrence patterns into ID embeddings, and aligns different modalities (text, images) into a unified semantic space.
At the session level, DIMO presents a multi-view self-supervised disentanglement approach, including a proxy mechanism and counterfactual inference, to distinguish ID and modality effects without supervised signals.
Leveraging the disentangled causes, DIMO provides recommendations via causal inference and generates two types of explanations: a co-occurrence template and a feature template.
Extensive experiments on multiple real-world datasets demonstrate DIMO's consistent superiority over existing state-of-the-art methods in both accuracy and interpretability.
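As a rough illustration of the "co-occurrence patterns" the abstract refers to, the sketch below counts how many sessions contain each pair of items. It shows only the raw statistic, not DIMO's co-occurrence representation schema, which the summary does not specify. The function name `cooccurrence_counts` and the toy sessions are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sessions):
    """Count how many sessions contain each unordered item pair.

    Illustrative only: this is a plain pair count, not DIMO's schema.
    """
    counts = Counter()
    for session in sessions:
        # Deduplicate so an item repeated within one session counts once,
        # and sort so each pair has a single canonical key.
        unique_items = sorted(set(session))
        for a, b in combinations(unique_items, 2):
            counts[(a, b)] += 1
    return counts


# Toy sessions (hypothetical data, not taken from the paper).
sessions = [
    ["shoes", "socks", "laces"],
    ["shoes", "socks"],
    ["socks", "hat"],
]
counts = cooccurrence_counts(sessions)
```

A count of this kind is the sort of quantity that could fill the (κ) slot in a co-occurrence explanation, although how DIMO actually derives κ is not stated here.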
Stats
Co-occurrence template: There are (κ) people frequently buying item x_i and recommended item x_{m+1} together.
Feature template: The recommended item x_{m+1} also possesses feature (f2) similar to the feature (f1) of the item x_i you have bought.
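The two lines above read as fill-in templates with slots for κ, x_i, x_{m+1}, f1 and f2. A minimal sketch of filling them follows. The function names, parameter names, and sample values are hypothetical, and how DIMO chooses which template to use is not covered by this summary.

```python
def co_occurrence_explanation(kappa, bought_item, recommended_item):
    """Fill the co-occurrence template with a count and two item names."""
    return (
        f"There are {kappa} people frequently buying item {bought_item} "
        f"and recommended item {recommended_item} together."
    )


def feature_explanation(recommended_item, rec_feature, bought_feature, bought_item):
    """Fill the feature template; rec_feature is f2 and bought_feature is f1."""
    return (
        f"The recommended item {recommended_item} also possesses feature "
        f"{rec_feature} similar to the feature {bought_feature} of the item "
        f"{bought_item} you have bought."
    )


# Sample values are invented purely for illustration.
co_text = co_occurrence_explanation(120, "running shoes", "sports socks")
feat_text = feature_explanation(
    "sports socks", "breathable mesh", "breathable upper", "running shoes"
)
```

Keeping the templates as plain functions makes each explanation traceable to the slot values that produced it.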
Quotes
"Disentangling ID and modality effects is challenging due to the absence of supervised signals that indicate which factor dominates user choice within a session."
"Failing to distinguish rationales of user actions, existing methods fail to generate convincing explanations."