Key Idea
Slipstream is a software framework that identifies stale embeddings during recommendation-model training and skips their updates, achieving substantial speedups while reducing CPU-GPU bandwidth usage.
Abstract
The paper presents Slipstream, a software framework that optimizes the training of deep learning recommendation models by identifying and skipping the updates to stale embeddings.
The key insights are:
- Recommendation models like DLRM have large embedding tables that are memory-intensive and account for a significant portion of training time.
- Within the hot embeddings (frequently accessed), some embeddings exhibit rapid training and minimal subsequent variation, resulting in saturation.
- Slipstream leverages this observation and employs three key components (see the sketch after this list):
- Snapshot Block: Periodically captures snapshots of the hot embeddings to track their training dynamics.
- Sampling Block: Efficiently estimates an optimal threshold to identify stale embeddings by sampling a subset of hot inputs.
- Input Classifier Block: Selectively filters inputs accessing stale embeddings and trains only on the varying embeddings.
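The interplay of the three blocks can be illustrated with a short NumPy sketch. This is a minimal sketch under stated assumptions, not the paper's actual implementation: the array shapes, the 10% sample size, and the 90th-percentile threshold are all illustrative choices.

```python
# Illustrative sketch of Slipstream's three blocks (assumed shapes/thresholds).
import numpy as np

rng = np.random.default_rng(0)
num_hot, dim = 1000, 16
emb = rng.normal(size=(num_hot, dim))          # hot embedding rows

# Snapshot Block: periodically copy the hot rows to track their dynamics.
snapshot = emb.copy()

# ... training proceeds; most rows barely move, a few still change ...
emb[:50] += 0.5 * rng.normal(size=(50, dim))                # toy "still-learning" rows
emb[50:] += 1e-4 * rng.normal(size=(num_hot - 50, dim))     # toy "saturated" rows

# Sampling Block: estimate a staleness threshold from a sampled subset of
# hot rows instead of scanning every row (assumed: 10% sample, 90th percentile).
delta = np.linalg.norm(emb - snapshot, axis=1)
sample_idx = rng.choice(num_hot, size=num_hot // 10, replace=False)
threshold = np.percentile(delta[sample_idx], 90)
stale = delta < threshold                      # rows considered saturated

# Input Classifier Block: drop inputs whose accessed embeddings are all stale,
# and train only on inputs that touch at least one still-varying embedding.
batch = rng.integers(0, num_hot, size=(32, 4)) # 32 inputs, 4 sparse features each
keep = ~stale[batch].all(axis=1)
train_batch = batch[keep]
print(f"training on {keep.sum()} of {len(batch)} inputs")
```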
Slipstream achieves substantial speedups of 2×, 2.4×, 1.2×, and 1.175× across real-world datasets and configurations, compared to baselines, while maintaining high accuracy.
Statistics
Embedding tables in real-world datasets can reach sizes of hundreds of gigabytes.
A small subset of 'hot' embeddings (frequently accessed) can receive over 100x more access than others.
Some 'hot' embeddings plateau and exhibit minimal update magnitudes after a certain stage of training.
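A back-of-the-envelope view of this access skew: count accesses per embedding row and flag the heavily hit rows as 'hot' candidates for snapshot tracking. The Zipf-like toy data and the "100x the median" cutoff below are assumptions for illustration, not the paper's methodology.

```python
# Toy illustration of identifying frequently accessed ("hot") embedding rows.
import numpy as np

rng = np.random.default_rng(0)
num_rows = 100_000
accesses = rng.zipf(a=1.5, size=1_000_000) % num_rows       # skewed row ids (assumed)

counts = np.bincount(accesses, minlength=num_rows)           # accesses per row
median_count = np.median(counts[counts > 0])
hot = np.flatnonzero(counts > 100 * max(median_count, 1))    # ">100x" heuristic

print(f"{hot.size} rows ({hot.size / num_rows:.2%}) flagged as hot")
```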
Quotes
"Training recommendation models pose significant challenges regarding resource utilization and performance."
"Slipstream optimizes training efficiency by selectively updating embedding values based on data awareness."
"Slipstream achieves substantial speedups of 2×, 2.4×, 1.2×, and 1.175× across real-world datasets and configurations, compared to baselines, while maintaining high accuracy."