Core Concepts
Statistics-driven pre-training tasks reduce the impact of random noise in user action sequences and stabilize the optimization of sequential recommendation models.
Abstract
The paper proposes the StatisTics-Driven Pre-training (STDP) framework to address the challenge of random noise in user action sequences, which can disrupt the optimization of sequential recommendation models.
Key highlights:
The authors reveal that inevitable random actions in user sequences, such as randomly accessing items or clicking items in random order, can lead to unstable supervision signals for the model.
To alleviate this issue, the STDP framework leverages statistics-driven pre-training tasks to stabilize the model optimization:
Co-occurred Items Prediction (CIP): Encourages the model to spread its attention across multiple suitable targets instead of focusing only on the next item.
Paired Sequence Similarity (PSS): Enhances the model's robustness to random noise by maximizing the similarity between the original sequence and a paired sequence with randomly replaced items.
Frequent Attribute Prediction (FAP): Helps the model capture stable long-term user preferences by predicting the attributes that appear most frequently in the sequence.
Extensive experiments on six datasets demonstrate the effectiveness of the proposed STDP framework, which outperforms state-of-the-art methods by a significant margin.
Further analysis shows that the framework generalizes: applying STDP to the GRU4Rec model also improves its performance.
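The three pre-training tasks above each rely on a simple statistic of the interaction data. The summary gives no implementation details, so the following is only a minimal sketch of how such statistics might be computed, not the authors' code. The function names, the toy data, the `replace_prob` parameter, and the choice to draw replacements uniformly from the item vocabulary are all assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sequences):
    """Count how many sequences contain each unordered item pair (CIP statistic)."""
    counts = Counter()
    for seq in sequences:
        # Use the set of items so a pair is counted once per sequence.
        for a, b in combinations(sorted(set(seq)), 2):
            counts[(a, b)] += 1
    return counts


def top_cooccurring(item, counts, k=2):
    """Return up to k items that co-occur most often with `item`."""
    partners = Counter()
    for (a, b), c in counts.items():
        if a == item:
            partners[b] += c
        elif b == item:
            partners[a] += c
    # Break ties alphabetically so the result is deterministic.
    ranked = sorted(partners.items(), key=lambda kv: (-kv[1], kv[0]))
    return [p for p, _ in ranked[:k]]


def make_paired_sequence(seq, vocabulary, replace_prob=0.2, rng=None):
    """Build a paired sequence for PSS by randomly replacing items.

    Assumption: each position is replaced independently with probability
    `replace_prob`, and the replacement is drawn uniformly from the other
    items in `vocabulary`.
    """
    rng = rng or random.Random()
    paired = []
    for item in seq:
        candidates = [v for v in vocabulary if v != item]
        if candidates and rng.random() < replace_prob:
            paired.append(rng.choice(candidates))
        else:
            paired.append(item)
    return paired


def frequent_attributes(seq, item_attributes, k=2):
    """Return the k attributes that appear most often in a sequence (FAP target)."""
    counts = Counter(
        attr for item in seq for attr in item_attributes.get(item, ())
    )
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [attr for attr, _ in ranked[:k]]


# Hypothetical toy data, not taken from the paper's datasets.
toy_sequences = [["a", "b", "c"], ["a", "b", "d"], ["a", "b"], ["a", "c"]]
toy_attributes = {
    "a": ["red", "small"],
    "b": ["red"],
    "c": ["blue", "small"],
    "d": ["blue"],
}

toy_counts = cooccurrence_counts(toy_sequences)
cip_targets = top_cooccurring("a", toy_counts, k=2)
pss_pair = make_paired_sequence(
    toy_sequences[0], ["a", "b", "c", "d"], replace_prob=0.5, rng=random.Random(0)
)
fap_targets = frequent_attributes(toy_sequences[0], toy_attributes, k=2)
```

In a full pipeline, outputs like `cip_targets` and `fap_targets` would serve as multi-label prediction targets, while `pss_pair` would be encoded alongside the original sequence so that a similarity objective can pull the two representations together. The model and loss functions are left out here because the summary does not specify them.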
Stats
The average sequence length across the six datasets ranges from 8.3 to 54.9 items per sequence.
The number of unique items in the datasets varies from 3,646 to 20,062.
The average number of attributes per item ranges from 3.7 to 31.5.