Core Concepts
The proposed Multi-Margin Cosine Loss (MMCL) efficiently utilizes not only the hardest negative samples but also other non-trivial negative samples, offering a simpler yet effective loss function that outperforms more complex methods, especially in resource-constrained environments.
Summary
The article proposes a new loss function called Multi-Margin Cosine Loss (MMCL) for recommender systems (RS) that addresses the challenges of efficient contrastive learning.
Key highlights:
- Recommender systems typically consist of three main components: an interaction module, a loss function, and a negative sampling strategy. Recent research has shifted focus towards refining loss functions and negative sampling strategies.
- Contrastive learning, which pulls similar pairs closer while pushing dissimilar ones apart, has gained popularity in RS. However, it can introduce challenges such as high memory demands and the under-utilization of non-trivial negative samples.
- MMCL addresses these challenges by introducing multiple margins and varying weights for negative samples. It efficiently exploits not only the hardest negatives but also other non-trivial negatives, offering a simpler yet effective loss function (see the sketch after this list).
- Experiments on two well-known datasets demonstrated that MMCL achieved up to a 20% performance improvement over a baseline loss function when fewer negative samples were used.
- MMCL performs comparably to state-of-the-art contrastive loss functions with a large negative sample size (e.g., 800) and achieves better results with smaller sample sizes (e.g., 10 or 100).
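The paper defines MMCL precisely; as a rough illustration of the core idea (per-margin weighted penalties on the cosine similarity of sampled negatives, alongside a pull term for the positive), here is a minimal PyTorch sketch. The function name and the specific margin/weight values are illustrative assumptions, not the paper's hyperparameters.

```python
import torch
import torch.nn.functional as F

def multi_margin_cosine_loss(user_emb, pos_emb, neg_emb,
                             margins=(0.9, 0.6, 0.3),
                             weights=(1.0, 0.5, 0.25)):
    """Illustrative multi-margin cosine loss (hypothetical sketch).

    user_emb: (B, d)    user embeddings
    pos_emb:  (B, d)    positive item embeddings
    neg_emb:  (B, N, d) sampled negative item embeddings
    """
    # Pull term: drive the cosine similarity with the positive item toward 1.
    pos_sim = F.cosine_similarity(user_emb, pos_emb, dim=-1)               # (B,)
    pos_loss = 1.0 - pos_sim

    # Cosine similarity with each sampled negative.
    neg_sim = F.cosine_similarity(user_emb.unsqueeze(1), neg_emb, dim=-1)  # (B, N)

    # Each (margin, weight) pair penalizes negatives whose similarity exceeds
    # that margin: a tight margin (high m) targets only the hardest negatives,
    # while looser margins also draw signal from other non-trivial negatives.
    neg_loss = 0.0
    for m, w in zip(margins, weights):
        neg_loss = neg_loss + w * F.relu(neg_sim - m).mean(dim=-1)

    return (pos_loss + neg_loss).mean()

# Toy usage: batch of 4 users, 10 sampled negatives, 64-dim embeddings.
u = torch.randn(4, 64)
p = torch.randn(4, 64)
n = torch.randn(4, 10, 64)
loss = multi_margin_cosine_loss(u, p, n)
```

With a single margin/weight pair this collapses to an ordinary single-margin cosine contrastive loss; the additional, looser margins are what let MMCL extract signal from non-trivial negatives rather than only the hardest ones, which is consistent with its stronger results at small negative sample sizes.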
Stats
The article reports the following key statistics for the two datasets used:

| Dataset | #Users | #Items | #Interactions | #Train | #Test | Density |
| --- | --- | --- | --- | --- | --- | --- |
| Yelp | 31,668 | 38,048 | 1,561,406 | 1,237,259 | 324,147 | 0.00130 |
| Gowalla | 29,858 | 40,981 | 1,027,370 | 810,128 | 217,242 | 0.00084 |
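Density here appears to be computed as #Interactions / (#Users × #Items): for Yelp, 1,561,406 / (31,668 × 38,048) ≈ 0.00130, and for Gowalla, 1,027,370 / (29,858 × 40,981) ≈ 0.00084.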
Quotes
"MMCL addresses these challenges by introducing multiple margins and varying weights for negative samples. It efficiently utilizes not only the hardest negatives but also other non-trivial negatives, offering a simpler yet effective loss function that outperforms more complex methods, especially when resources are limited."