Core Concepts
FlyKD enables effectively unlimited pseudo label generation on the fly, paired with Curriculum Learning to improve optimization over the resulting noisy labels.
Abstract
Knowledge Distillation (KD) transfers a teacher model's knowledge to a smaller student model.
FlyKD generates pseudo labels on the fly, bypassing the storage limits that cap how many pseudo labels traditional KD methods can use.
Curriculum Learning helps the student model optimize over the noisy pseudo labels (a minimal sketch follows this list).
FlyKD outperforms vanilla KD and LSPGCN in link prediction tasks.
Future research direction: improving optimization over noisy pseudo labels.
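To make the curriculum idea concrete, here is a minimal PyTorch sketch of confidence-based label weighting that relaxes over epochs: early on, uncertain (likely noisier) pseudo labels are down-weighted, and later they are gradually admitted. The linear schedule, the confidence measure, and the names `curriculum_weights` and `distill_loss` are illustrative assumptions, not the paper's exact scheme.

```python
import torch

def curriculum_weights(teacher_probs: torch.Tensor, epoch: int,
                       total_epochs: int) -> torch.Tensor:
    """Per-label weights that start confidence-focused and relax to uniform.

    Early epochs down-weight uncertain (likely noisier) pseudo labels;
    later epochs admit them. The linear schedule is an assumption.
    """
    confidence = (teacher_probs - 0.5).abs() * 2         # in [0, 1]
    progress = epoch / max(total_epochs - 1, 1)          # 0 -> 1 over training
    return (1 - progress) * confidence + progress * torch.ones_like(confidence)

def distill_loss(student_logits: torch.Tensor, teacher_probs: torch.Tensor,
                 weights: torch.Tensor) -> torch.Tensor:
    """Weighted soft-label distillation loss for link prediction."""
    per_label = torch.nn.functional.binary_cross_entropy_with_logits(
        student_logits, teacher_probs, reduction="none")
    return (weights * per_label).mean()
```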
Stats
"Empirically, we observe that FlyKD outperforms vanilla KD and the renown Local Structure Preserving Graph Convolutional Network (LSPGCN)."
"We show that by storing the probability scores on the links of newly generated random graph per epoch, we can generate 100-1000x or more pseudo labels beyond the threshold of the traditional KD methods."
Quotes
"Generating tremendous amount of pseudo labels comes at a cost: the pseudo labels are extra noisy."