Adaptive Memory Replay for Efficient Continual Pre-Training of Foundation Models
An adaptive memory replay approach that dynamically selects past data samples to minimize forgetting while maintaining computational efficiency during continual pre-training of large-scale foundation models.
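
As a rough illustration of what "dynamically selecting past data samples" could look like, the sketch below keeps a running loss estimate per group of stored samples and draws replay samples more often from groups whose loss is currently high. Everything here is an assumption made for illustration and is not taken from the source: the class name `AdaptiveReplaySampler`, the grouping of memory into named buckets, the use of recent loss as a proxy for forgetting, the softmax weighting, and the `temperature` and `ema` parameters.

```python
import math
import random


class AdaptiveReplaySampler:
    """Hypothetical sketch: draws replay samples from groups of past data,
    favoring groups whose recent loss (a proxy for forgetting) is high."""

    def __init__(self, groups, temperature=1.0, ema=0.9, seed=0):
        # groups: dict mapping a group name to a non-empty list of stored samples
        self.groups = groups
        self.temperature = temperature  # higher values flatten the distribution
        self.ema = ema                  # smoothing factor for the running loss
        self.scores = {name: 0.0 for name in groups}
        self.rng = random.Random(seed)

    def probabilities(self):
        # Softmax over the running loss estimates (shifted by the max for stability).
        names = list(self.groups)
        top = max(self.scores[n] for n in names)
        weights = [math.exp((self.scores[n] - top) / self.temperature) for n in names]
        total = sum(weights)
        return {n: w / total for n, w in zip(names, weights)}

    def sample(self, k):
        # Pick k groups according to the current probabilities,
        # then one stored sample uniformly from each chosen group.
        probs = self.probabilities()
        names = list(probs)
        chosen = self.rng.choices(names, weights=[probs[n] for n in names], k=k)
        return [(name, self.rng.choice(self.groups[name])) for name in chosen]

    def update(self, name, loss):
        # Exponential moving average of the loss observed on replayed samples.
        self.scores[name] = self.ema * self.scores[name] + (1 - self.ema) * loss


# Usage: after replaying a batch, feed the observed loss back into the sampler.
sampler = AdaptiveReplaySampler({"news": ["n1", "n2"], "code": ["c1", "c2"]})
sampler.update("code", 4.0)       # "code" now looks more forgotten than "news"
replay_batch = sampler.sample(4)  # list of (group name, stored sample) pairs
```

The per-step cost of this sketch scales with the number of groups rather than the size of the memory, which is one simple way to keep selection cheap; a real system might use a different selection rule entirely.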