Core Concepts
MergeNet is a framework that transfers knowledge between models with different architectures, tasks, and modalities by bridging the disparities in their parameter spaces.
Abstract
The paper introduces MergeNet, a novel framework for heterogeneous knowledge transfer across diverse models, tasks, and modalities. Unlike conventional knowledge transfer methods that rely on shared elements within model structures or task-specific features/labels, MergeNet focuses on the intrinsic properties of model parameters as the natural carriers of knowledge.
The core mechanism of MergeNet is the Low-rank Parametric Knowledge Adapter (LPKA), which operates by querying the source model's low-rank parameters and learning to map them into the target model's parameter space. This allows for direct interaction, extraction, and application of knowledge between heterogeneous models.
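The adapter idea described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`low_rank_factors`, `adapt_parameters`), the use of truncated SVD for the low-rank step, the attention-style query/key/value form, the residual addition, and the random projection matrices (stand-ins for weights that would be learned) are all assumptions made for illustration. It only shows how low-rank factors from a source layer of one shape could be queried and mapped into a target layer of a different shape.

```python
import numpy as np


def low_rank_factors(weight, rank):
    """Truncated SVD so that weight ~= left @ right, with left (m, r) and right (r, n)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    root = np.sqrt(s[:rank])
    return u[:, :rank] * root, root[:, None] * vt[:rank]


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def adapt_parameters(target_w, source_w, rank, proj_q, proj_k, proj_v):
    """Hypothetical adapter: target factors query source factors via attention.

    proj_q, proj_k have shape (rank, d); proj_v has shape (rank, rank).
    In a real system these would be learned; here they are passed in.
    """
    t_left, t_right = low_rank_factors(target_w, rank)  # (m_t, r), (r, n_t)
    s_left, _ = low_rank_factors(source_w, rank)        # (m_s, r)

    queries = t_left @ proj_q                           # (m_t, d)
    keys = s_left @ proj_k                              # (m_s, d)
    values = s_left @ proj_v                            # (m_s, r)

    # Each target row attends over all source rows.
    scores = queries @ keys.T / np.sqrt(queries.shape[1])
    attention = softmax(scores, axis=-1)                # (m_t, m_s)

    # Residual update of the target's low-rank factor, then map back
    # to the target's full parameter shape.
    mixed_left = t_left + attention @ values            # (m_t, r)
    return mixed_left @ t_right                         # (m_t, n_t)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target_w = rng.normal(size=(6, 4))    # target layer
    source_w = rng.normal(size=(10, 8))   # source layer with a different shape
    rank, d = 3, 5
    proj_q = rng.normal(size=(rank, d))
    proj_k = rng.normal(size=(rank, d))
    proj_v = rng.normal(size=(rank, rank))
    updated = adapt_parameters(target_w, source_w, rank, proj_q, proj_k, proj_v)
    print(updated.shape)  # (6, 4)
```

Because the attention runs over low-rank factors rather than full weight matrices, the source and target layers do not need matching shapes; only the chosen rank has to be shared.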
MergeNet is trained jointly with both the source and target models, so knowledge is transferred and adapted dynamically throughout training, including knowledge carried by the source model's training trajectory. Extensive experiments demonstrate significant improvements in challenging settings where representative approaches falter or are less applicable.
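As a rough illustration of what training the adapter together with both models can look like, the toy sketch below updates a source model, a target model, and an adapter in the same loop. Everything here is an assumption made for the example: both models are plain linear regressors on synthetic data, the adapter is a single matrix that maps the source weights into a correction for the target weights, and gradients are written out by hand. It is not the paper's training procedure.

```python
import numpy as np


def joint_training(steps=200, lr=0.02, seed=0):
    """Toy joint loop: source model, target model, and adapter updated together.

    Returns the target loss recorded at the start of every step.
    """
    rng = np.random.default_rng(seed)

    # Synthetic, noise-free regression tasks with different input sizes.
    x_s = rng.normal(size=(64, 4))
    y_s = x_s @ np.array([1.0, -0.5, 0.5, 0.25])
    x_t = rng.normal(size=(64, 3))
    y_t = x_t @ np.array([0.5, 1.0, -1.0])

    w_s = np.zeros(4)            # source model weights
    w_t = np.zeros(3)            # target model weights
    adapter = np.zeros((3, 4))   # hypothetical map: source weights -> target space

    history = []
    for _ in range(steps):
        # Source model trains on its own task.
        grad_s = 2.0 * x_s.T @ (x_s @ w_s - y_s) / len(y_s)

        # Target prediction uses its own weights plus the adapted source
        # weights, so the current state of the source model is visible to it.
        w_eff = w_t + adapter @ w_s
        residual = x_t @ w_eff - y_t
        history.append(float(np.mean(residual ** 2)))

        # Gradients of the target loss w.r.t. target weights and adapter.
        # The source weights are treated as constants for the target loss.
        grad_eff = 2.0 * x_t.T @ residual / len(y_t)
        grad_adapter = np.outer(grad_eff, w_s)

        # All three sets of parameters are updated in the same step.
        w_s -= lr * grad_s
        w_t -= lr * grad_eff
        adapter -= lr * grad_adapter

    return history


if __name__ == "__main__":
    losses = joint_training()
    print(f"target loss: first step {losses[0]:.4f}, last step {losses[-1]:.6f}")
```

Since the adapter reads the source weights at every step, the target sees the source model as it evolves rather than only its final state, which is the sense in which trajectory information can be passed along in this toy setting.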
The authors explore various scenarios, including cross-structure, cross-modal, and cross-task knowledge transfer, as well as self-knowledge transfer within a single model. The results show that MergeNet consistently outperforms existing knowledge transfer methods, highlighting its versatility and robustness in bridging the gap between heterogeneous models.
Stats
"The parameter sharing method is ineffective for heterogeneous knowledge transfer, and in fact, may lead to a loss of accuracy due to the incompatibility of knowledge."
"MergeNet significantly improves model performance and surpasses the widely-used knowledge distillation techniques."
Quotes
"Unlike previous knowledge transfer methods, we consider knowledge transfer between models from a different perspective. We pivot on the intrinsic properties of model parameters, regarding them as the natural carriers of knowledge."
"MergeNet ingeniously orchestrates the mapping of model parameters into a low-dimensional parameter space, thereby harmonizing and aligning this space to facilitate a seamless and efficient knowledge transfer."