Efficient and Robust Fine-Tuning of Pre-Trained Language Models by Transferring Training Dynamics
Training dynamics are highly transferable across model sizes and pre-training methods, enabling efficient and robust fine-tuning of pre-trained language models.