Core Concepts
The proposed Multi-Margin Cosine Loss (MMCL) efficiently utilizes not only the hardest negative samples but also other non-trivial negative samples, offering a simpler yet effective loss function that outperforms more complex methods, especially in resource-constrained environments.
Summary
The article proposes a new loss function called Multi-Margin Cosine Loss (MMCL) for recommender systems (RS) that addresses the challenges of efficient contrastive learning.
Key highlights:
- Recommender systems typically consist of three main components: an interaction module, a loss function, and a negative sampling strategy. Recent research has shifted focus towards refining loss functions and negative sampling strategies.
- Contrastive learning, which pulls similar pairs closer while pushing dissimilar ones apart, has gained popularity in RS. However, it may bring challenges like high memory demands and under-utilization of some negative samples.
- MMCL addresses these challenges by introducing multiple margins and varying weights for negative samples. It efficiently utilizes not only the hardest negatives but also other non-trivial negatives, offering a simpler yet effective loss function (see the sketch after this list).
- Experiments on two well-known datasets showed that MMCL achieved up to a 20% performance improvement over a baseline loss function when fewer negative samples were used.
- MMCL performs comparably to state-of-the-art contrastive loss functions with a large negative sample size (e.g., 800) and achieves better results with smaller sample sizes (e.g., 10 or 100).
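To make the idea concrete, here is a minimal PyTorch sketch of a multi-margin cosine loss, assuming batched user/item embeddings. The margin and weight values, tensor shapes, and function name are illustrative assumptions, not the paper's exact formulation or settings.

```python
import torch
import torch.nn.functional as F

def multi_margin_cosine_loss(user_emb, pos_emb, neg_emb,
                             margins=(0.9, 0.6, 0.3),
                             weights=(1.0, 0.5, 0.25)):
    """Sketch of a multi-margin cosine loss.

    user_emb: (B, d) user embeddings
    pos_emb:  (B, d) positive item embeddings
    neg_emb:  (B, K, d) negative item embeddings
    margins/weights: illustrative placeholder values.
    """
    # Cosine similarity with the positive item: pull similar pairs closer.
    pos_sim = F.cosine_similarity(user_emb, pos_emb, dim=-1)               # (B,)
    pos_loss = 1.0 - pos_sim

    # Cosine similarity with each negative item: push dissimilar pairs apart.
    neg_sim = F.cosine_similarity(user_emb.unsqueeze(1), neg_emb, dim=-1)  # (B, K)

    # Each margin level contributes a weighted hinge term: a tight margin
    # penalizes only the hardest negatives, while looser margins keep other
    # non-trivial negatives contributing to the gradient.
    neg_loss = 0.0
    for m, w in zip(margins, weights):
        neg_loss = neg_loss + w * F.relu(neg_sim - m).mean(dim=-1)         # (B,)

    return (pos_loss + neg_loss).mean()
```

With a single margin/weight pair this collapses to an ordinary cosine contrastive hinge loss; the additional margin levels are what let the loss exploit a spectrum of negative difficulties even when the negative sample size is small.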
Statistics
The article provides the following key statistics about the datasets used:
| Dataset | #Users | #Items | #Interactions | #Train    | #Test   | Density |
|---------|--------|--------|---------------|-----------|---------|---------|
| Yelp    | 31,668 | 38,048 | 1,561,406     | 1,237,259 | 324,147 | 0.00130 |
| Gowalla | 29,858 | 40,981 | 1,027,370     | 810,128   | 217,242 | 0.00084 |
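Density here is the fraction of observed user-item pairs, #Interactions / (#Users × #Items): for Yelp, 1,561,406 / (31,668 × 38,048) ≈ 0.00130, matching the reported value, and likewise 1,027,370 / (29,858 × 40,981) ≈ 0.00084 for Gowalla.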
Quotes
"MMCL addresses these challenges by introducing multiple margins and varying weights for negative samples. It efficiently utilizes not only the hardest negatives but also other non-trivial negatives, offering a simpler yet effective loss function that outperforms more complex methods, especially when resources are limited."