Core Concepts
Contrastive learning can be used to refine pre-existing embeddings and improve their performance on downstream tasks.
Abstract
The paper proposes SIMSKIP, a novel contrastive learning framework that takes pre-trained embeddings as input and refines them with a skip-connection-based encoder-projector architecture.
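This description suggests a SimCLR-style encoder-projector pair in which the encoder adds a learned residual to the input embedding. The sketch below is a minimal illustration of that idea; the class names, layer sizes, and MLP shapes are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SimSkipEncoder(nn.Module):
    """Refines a pre-trained embedding by adding a learned residual.

    The skip connection (x + mlp(x)) lets the encoder fall back to the
    identity map, so the original embedding is never thrown away.
    Layer sizes here are illustrative assumptions.
    """

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.mlp(x)  # skip connection

class Projector(nn.Module):
    """SimCLR-style projection head, used only during contrastive training."""

    def __init__(self, dim: int, proj_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, proj_dim),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)
```

After training, the projector would be discarded and the encoder's output used as the refined embedding for downstream tasks, mirroring the usual SimCLR recipe.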
Key highlights:
The authors identify a limitation of existing contrastive learning methods: they focus on raw input data modalities but overlook the potential of refining embeddings that have already been pre-trained.
SIMSKIP incorporates skip connections so that the refined embedding retains the expressiveness of the original while learning a residual refinement; the authors prove theoretically that this design does not enlarge the error upper bound on downstream tasks (see the training sketch after this list).
Extensive experiments across datasets and tasks, including knowledge graph embeddings, image classification, node classification, and text embeddings, demonstrate that SIMSKIP improves downstream performance over the original embeddings.
The authors also conduct an ablation study to show the importance of the skip connection in SIMSKIP's architecture.
The paper provides a principled approach to refining pre-trained embeddings with contrastive learning, one that is broadly applicable across data modalities and downstream applications.
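Training such an encoder contrastively requires two augmented "views" of each pre-trained embedding and an InfoNCE-style loss. The sketch below uses the standard NT-Xent loss from SimCLR; the Gaussian-noise augmentation in embedding space and all hyperparameters are assumptions, since the summary does not specify the paper's choices.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Standard NT-Xent (InfoNCE) loss; z1[i] and z2[i] are two views of item i."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2n, d), unit norm
    sim = (z @ z.t()) / tau                              # cosine similarity / tau
    # Exclude self-similarity from the softmax denominator.
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))
    # Row i's positive is its counterpart in the other view.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def train_step(encoder, projector, optimizer, emb: torch.Tensor) -> float:
    """One contrastive step over a batch of frozen pre-trained embeddings.

    Gaussian noise as the embedding-space augmentation is our assumption.
    """
    v1 = emb + 0.05 * torch.randn_like(emb)
    v2 = emb + 0.05 * torch.randn_like(emb)
    loss = nt_xent(projector(encoder(v1)), projector(encoder(v2)))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Here emb would be a batch of frozen pre-trained embeddings (e.g., rows of a node- or word-embedding matrix); only the encoder and projector parameters are updated.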
Quotes
"To the best of our knowledge, we are the first to propose and investigate the use of contrastive learning to improve the robustness of embedding spaces."
"We theoretically prove that after applying SIMSKIP on the input embedding space, for a downstream task, the error upper bound of the new learned fine-tuned embedding will not be larger than that of the original embedding space."