The paper proposes a novel training framework called Structural Compression (StructComp) to address the scalability challenge of graph contrastive learning (GCL). StructComp is motivated by a sparse low-rank approximation of the diffusion matrix, which allows the encoder to be trained without performing any message passing.
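A minimal sketch of why such an approximation removes message passing, assuming a one-layer GCN-style encoder and a binary partition matrix; the notation and the exact form of the approximation are illustrative, not quoted from the paper:

```latex
% Hedged sketch (one GCN-style layer; notation assumed, not the paper's verbatim formulation).
% Let P \in \{0,1\}^{n \times n'} assign each node to one cluster (one-hot rows), and suppose
% the diffusion matrix admits a sparse low-rank approximation \hat{A} \approx P P^{\top}.
% Because the rows of P are one-hot, the elementwise nonlinearity \sigma commutes with P:
Z \;=\; \sigma\bigl(\hat{A} X W\bigr)
  \;\approx\; \sigma\bigl(P P^{\top} X W\bigr)
  \;=\; P\,\sigma\bigl(X_c W\bigr),
\qquad X_c := P^{\top} X .
% Hence W can be trained as a plain MLP on the n' \ll n compressed rows X_c, with no
% message passing; normalization of P (e.g. per-cluster averaging) is omitted for clarity.
```

On this reading, the MLP and the GNN share the same weight matrix $W$, which is consistent with the parameter transfer described below: weights learned without message passing can later be loaded into the GNN encoder.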
The key idea is to use a graph partition matrix to compress the nodes, such that nodes in the same cluster share the same embedding. This reduces the number of sample pairs needed for the contrastive loss computation and eliminates the need for the encoder to perform message passing during training.
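A small NumPy sketch of this compression step; the helper name `compress_features`, the mean pooling, and the idea that `assignments` comes from an off-the-shelf graph partitioner (e.g. METIS) are assumptions for illustration rather than the paper's exact procedure:

```python
import numpy as np

def compress_features(X, assignments, n_clusters):
    """Pool node features cluster-wise so that every node in a cluster is
    represented by a single compressed row (illustrative mean pooling)."""
    d = X.shape[1]
    X_c = np.zeros((n_clusters, d), dtype=X.dtype)
    counts = np.bincount(assignments, minlength=n_clusters).astype(X.dtype)
    # Sum the features of the nodes assigned to each cluster, then average.
    np.add.at(X_c, assignments, X)
    return X_c / np.maximum(counts, 1.0)[:, None]

# Toy usage: 6 nodes, 4 features, 2 clusters from a (hypothetical) graph partition.
X = np.random.randn(6, 4).astype(np.float32)
assignments = np.array([0, 0, 1, 1, 1, 0])   # cluster id per node
X_c = compress_features(X, assignments, n_clusters=2)
print(X_c.shape)   # (2, 4): one row per cluster, shared by its member nodes
```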
For single-view GCL models, StructComp replaces the GNN encoder with a simpler MLP encoder during training, and then transfers the learned parameters to the original GNN encoder for inference. For multi-view GCL models, StructComp introduces a novel data augmentation method called "DropMember" to generate different representations of the compressed nodes.
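A hedged reading of DropMember in the same sketch style: before pooling, each cluster randomly drops some of its member nodes, so two independent draws produce two different compressed representations of the same cluster. The drop rate and masking details are assumptions, not the paper's specification:

```python
import numpy as np

def drop_member_view(X, assignments, n_clusters, drop_rate=0.2, rng=None):
    """One augmented compressed view: randomly drop member nodes of each
    cluster, then mean-pool the surviving members (illustrative sketch)."""
    rng = rng or np.random.default_rng()
    keep = rng.random(X.shape[0]) > drop_rate            # per-node keep mask
    X_c = np.zeros((n_clusters, X.shape[1]), dtype=X.dtype)
    counts = np.zeros(n_clusters, dtype=X.dtype)
    np.add.at(X_c, assignments[keep], X[keep])
    np.add.at(counts, assignments[keep], 1.0)
    return X_c / np.maximum(counts, 1.0)[:, None]        # an emptied cluster stays at zero

# Two stochastic draws give two views of the same compressed graph,
# which the MLP encoder can embed for the multi-view contrastive loss.
# view_a = drop_member_view(X, assignments, n_clusters=2)
# view_b = drop_member_view(X, assignments, n_clusters=2)
```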
The paper provides a theoretical analysis showing that the compressed contrastive loss approximates the original GCL loss, and that StructComp introduces an additional regularization term, which makes the encoder more robust.
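To make the compressed contrastive loss concrete, below is a generic InfoNCE-style objective computed on two compressed views (e.g. the two DropMember draws above), so that only the $n'$ cluster-level pairs enter the loss rather than all node-level pairs. The symmetric InfoNCE form and the NumPy formulation are illustrative; the paper's exact loss may differ, and in practice this would be written with an autodiff framework:

```python
import numpy as np

def _infonce_direction(sim):
    """Cross-entropy of each row against its diagonal (positive) entry."""
    sim = sim - sim.max(axis=1, keepdims=True)             # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def compressed_infonce(Z_a, Z_b, temperature=0.5):
    """Symmetric InfoNCE over cluster-level embeddings: row i of Z_a and
    row i of Z_b (same cluster, different views) form the positive pair."""
    Z_a = Z_a / np.linalg.norm(Z_a, axis=1, keepdims=True)
    Z_b = Z_b / np.linalg.norm(Z_b, axis=1, keepdims=True)
    sim = (Z_a @ Z_b.T) / temperature                      # (n', n') cluster pairs only
    return 0.5 * (_infonce_direction(sim) + _infonce_direction(sim.T))

# e.g. Z_a, Z_b = mlp(view_a), mlp(view_b) for the two DropMember views;
# the loss then ranges over n' cluster pairs instead of n node pairs.
```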
Extensive experiments on various datasets demonstrate that StructComp significantly reduces the time and memory consumption of GCL training while improving performance over vanilla GCL models and other scalable training methods.