FlyKD addresses the limitations of traditional Knowledge Distillation (KD) by dynamically generating a large number of pseudo labels. To cope with the noise in these pseudo labels, FlyKD incorporates Curriculum Learning, which stabilizes the student model's optimization. The paper highlights how difficult it is to train a student model on the noisy pseudo labels produced by a teacher model and emphasizes the importance of stabilizing this process. Empirically, FlyKD outperforms both vanilla KD and LSPGCN on link prediction tasks. The integration of Curriculum Learning points to a new research direction: optimizing student model training over noisy pseudo labels.
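To make the idea concrete, here is a minimal sketch of curriculum-weighted distillation over pseudo labels. The linear confidence schedule, the function names, and the squared-error loss are all illustrative assumptions for exposition, not FlyKD's actual formulation: the point is only that early epochs restrict training to high-confidence pseudo labels, while later epochs admit progressively noisier ones.

```python
import numpy as np

def curriculum_weights(teacher_confidence, epoch, total_epochs):
    """Illustrative curriculum schedule (assumption, not FlyKD's exact scheme):
    the confidence threshold relaxes linearly from strict to permissive, so
    noisier pseudo labels are only included in later epochs."""
    threshold = 1.0 - epoch / total_epochs
    return np.where(teacher_confidence >= threshold, 1.0, 0.0)

def distillation_loss(student_scores, teacher_scores, weights):
    """Weighted squared error between student and teacher link scores."""
    return float(np.mean(weights * (student_scores - teacher_scores) ** 2))

# Toy example: four pseudo-labeled links with varying teacher confidence.
conf = np.array([0.9, 0.7, 0.4, 0.2])
teacher = np.array([0.95, 0.8, 0.5, 0.1])
student = np.array([0.6, 0.6, 0.6, 0.6])

# Early on, only the most confident pseudo label contributes to the loss;
# near the end of training, all four do.
early_w = curriculum_weights(conf, epoch=1, total_epochs=10)
late_w = curriculum_weights(conf, epoch=9, total_epochs=10)
early_loss = distillation_loss(student, teacher, early_w)
late_loss = distillation_loss(student, teacher, late_w)
```

The hard 0/1 gating above is the simplest possible curriculum; a soft weighting (e.g. scaling each pseudo label by its confidence) would serve the same stabilizing purpose.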