Delayed Bottlenecking Pre-training: Alleviating Forgetting in Pre-trained Graph Neural Networks
The core message of this article is that the conventional pre-training-then-fine-tuning strategy for graph neural networks (GNNs) can cause the knowledge captured during pre-training to be forgotten, which hurts performance on downstream tasks. The authors propose a Delayed Bottlenecking Pre-training (DBP) framework that preserves as much mutual information as possible between the latent representations and the training data during the pre-training phase, and delays the compression operation to the fine-tuning phase, where it can be guided by the labeled fine-tuning data and the downstream task.
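To make the two-phase schedule concrete, below is a minimal PyTorch sketch of the idea, not the authors' implementation. It assumes pre-training maximizes a contrastive (InfoNCE-style) lower bound on mutual information between latent representations and the input graph with no compression term, while fine-tuning adds a label-guided objective plus a small compression penalty. The encoder, the feature-dropout augmentation, and the norm-based penalty are all illustrative stand-ins for whatever DBP actually uses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Stand-in for a GNN encoder: one mean-aggregation message-passing layer."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, hid_dim)

    def forward(self, x, adj):
        # adj: row-normalized dense adjacency (kept dense for simplicity)
        return F.relu(self.lin(adj @ x))

def info_nce(z1, z2, tau=0.5):
    """Contrastive lower bound on mutual information between two views
    (assumption: DBP's MI-retention objective is of this general kind)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0))  # matching rows are positive pairs
    return F.cross_entropy(logits, labels)

def pretrain_step(enc, opt, x, adj):
    """Pre-training: retain information; no compression term."""
    # Two stochastic views via feature dropout (hypothetical augmentation)
    z1 = enc(F.dropout(x, 0.2), adj)
    z2 = enc(F.dropout(x, 0.2), adj)
    loss = info_nce(z1, z2)  # maximize MI(latent; data)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

def finetune_step(enc, head, opt, x, adj, y, beta=1e-3):
    """Fine-tuning: compression is applied now, guided by labels."""
    z = enc(x, adj)
    task_loss = F.cross_entropy(head(z), y)
    # Bottleneck-style surrogate: shrink representations so that
    # task-irrelevant information is discarded under label supervision.
    compress = z.pow(2).mean()
    loss = task_loss + beta * compress
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Toy usage on a random graph with self-loops only
enc = Encoder(in_dim=16, hid_dim=32)
head = nn.Linear(32, 4)
x = torch.randn(10, 16)
adj = torch.eye(10)
y = torch.randint(0, 4, (10,))
opt = torch.optim.Adam(list(enc.parameters()) + list(head.parameters()), lr=1e-3)
pretrain_step(enc, opt, x, adj)
finetune_step(enc, head, opt, x, adj, y)
```

The key design point the sketch mirrors is the ordering: the compression penalty appears only in `finetune_step`, so any discarding of information happens after labels are available rather than during unsupervised pre-training.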