Core Concepts
Replacing costly node-wise repulsion in skip-gram graph embeddings with efficient dimension-wise regularization can preserve dissimilarity while improving scalability and representation quality.
Abstract
The paper presents an efficient approach to preserving dissimilarity in graph embeddings. The key insights are:
The negative function in skip-gram graph embedding objectives, which enforces dissimilarity between node embeddings, can be approximated via dimension regularization instead of costly node-wise repulsion (an illustrative sketch of both terms follows this list).
As the need for node repulsion grows (i.e., as the embeddings begin to collapse because only the similarity term is optimized), the dimension regularization approach converges to optimizing the original skip-gram negative function.
The authors propose an algorithm augmentation framework that replaces skip-gram negative sampling (SGNS) with dimension regularization, reducing the time complexity of the repulsion (negative) term from linear in the number of nodes to linear in the number of embedding dimensions (see the code sketch after this list).
Empirical evaluations show that the augmented LINE and node2vec algorithms preserve downstream performance on link prediction tasks while dramatically reducing training runtime compared to the original algorithms using SGNS.
The authors also demonstrate that Positive-Only baselines, which remove the negative function entirely, are sensitive to graph connectivity and prone to embedding collapse, whereas the dimension regularization approach is more robust.
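To make the contrast concrete, here is a rough sketch of the two terms in equations. The notation is generic, and the dimension-wise surrogate shown is an assumption for illustration, not necessarily the paper's exact regularizer:

```latex
% Illustrative only: a generic skip-gram objective with an explicit negative
% (repulsion) term; u_i is the embedding of node i and \sigma is the sigmoid.
\mathcal{L}(U) =
  \underbrace{-\sum_{(i,j)\ \text{co-occur}} \log \sigma\!\left(u_i^{\top} u_j\right)}_{\text{similarity (attraction)}}
  \;\underbrace{-\sum_{i}\sum_{j \ne i} \log \sigma\!\left(-u_i^{\top} u_j\right)}_{\text{negative function (node-wise repulsion)}}

% One possible dimension-wise surrogate for the repulsion term (an assumption
% for illustration): penalize each dimension's aggregate so embeddings cannot
% collapse onto a single point; the outer sum runs over d dimensions, not n nodes.
\mathcal{R}(U) = \tfrac{1}{2}\Big\lVert \sum_{i} u_i \Big\rVert_2^{2}
             = \tfrac{1}{2}\sum_{k=1}^{d}\Big(\sum_{i} u_{ik}\Big)^{2}
```

The key point is that the surrogate's per-node gradient depends only on per-dimension aggregates, so its cost scales with the embedding dimension d rather than with the number of other nodes.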
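Below is a minimal NumPy sketch contrasting the cost of an explicit node-wise repulsion update with a dimension-wise surrogate update. Function names, the sampled-negative setup, and the surrogate penalty are assumptions for illustration, not the authors' implementation:

```python
# Illustrative sketch (not the paper's exact algorithm): compare a node-wise
# repulsion gradient, whose cost grows with the number of repelled nodes,
# against a dimension-wise surrogate whose per-node cost grows only with d.

import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 128                        # number of nodes, embedding dimension
U = rng.normal(scale=0.1, size=(n, d))    # node embeddings

def sgns_repulsion_grad(U, i, neg_idx):
    """Gradient of the SGNS-style negative term for node i against explicitly
    sampled negative nodes: cost scales with len(neg_idx) * d."""
    scores = U[neg_idx] @ U[i]                # (k,) dot products u_i . u_j
    weights = 1.0 / (1.0 + np.exp(-scores))   # sigmoid(u_i . u_j)
    return weights @ U[neg_idx]               # (d,) gradient w.r.t. u_i

def dim_reg_grad(col_sum):
    """Gradient of the surrogate 0.5 * ||sum_j u_j||^2 w.r.t. u_i is just the
    vector of per-dimension column sums: O(d) once the sums are cached."""
    return col_sum

# One toy update for a single node.
i = 42
neg_idx = rng.integers(0, n, size=5)      # 5 sampled negatives
lr = 0.01

g_sgns = sgns_repulsion_grad(U, i, neg_idx)   # O(k * d) per node
col_sum = U.sum(axis=0)                       # maintained incrementally in practice
g_dimreg = dim_reg_grad(col_sum)              # O(d) per node

U[i] -= lr * g_dimreg                         # apply the cheap repulsion surrogate
```

In practice the per-dimension column sums would be maintained incrementally across updates, so the surrogate stays linear in d per node, whereas explicit repulsion scales with the number of repelled nodes times d.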