The paper introduces MergeNet, a novel framework for heterogeneous knowledge transfer across diverse models, tasks, and modalities. Unlike conventional knowledge transfer methods that rely on shared elements within model structures or on task-specific features and labels, MergeNet treats model parameters themselves as the natural carriers of knowledge.
The core mechanism of MergeNet is the Low-rank Parametric Knowledge Adapter (LPKA), which operates by querying the source model's low-rank parameters and learning to map them into the target model's parameter space. This allows for direct interaction, extraction, and application of knowledge between heterogeneous models.
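To make the mechanism concrete, here is a minimal sketch of an LPKA-style adapter in PyTorch. The truncated-SVD factorization, the class and helper names, the dimensions, and the cross-attention form of the mapping are illustrative assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def low_rank_factors(w: torch.Tensor, rank: int):
    """Obtain 'low-rank parameters' via truncated SVD (an assumed choice).
    Returns the row factor U*S (rows x rank) and column factor Vh (rank x cols)."""
    u, s, vh = torch.linalg.svd(w, full_matrices=False)
    return u[:, :rank] * s[:rank], vh[:rank]

class LPKA(nn.Module):
    """Sketch of a Low-rank Parametric Knowledge Adapter: rows of the
    target's low-rank factor attend over rows of the source's factor,
    and the attended values are added as a residual update."""

    def __init__(self, rank: int, hidden: int = 64):
        super().__init__()
        self.query = nn.Linear(rank, hidden)  # queries from target factor rows
        self.key = nn.Linear(rank, hidden)    # keys from source factor rows
        self.value = nn.Linear(rank, rank)    # values mapped into target space

    def forward(self, src_factor: torch.Tensor, tgt_factor: torch.Tensor) -> torch.Tensor:
        q = self.query(tgt_factor)                       # (n_tgt, hidden)
        k = self.key(src_factor)                         # (n_src, hidden)
        v = self.value(src_factor)                       # (n_src, rank)
        attn = F.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)
        return tgt_factor + attn @ v                     # (n_tgt, rank)
```

Because the attention operates over rows of the low-rank factors, the source and target weight matrices may have entirely different shapes, which is what lets an adapter of this kind bridge heterogeneous architectures.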
MergeNet is trained alongside both models, enabling dynamic transfer and adaptation of knowledge, including knowledge from the source model's training trajectory. Extensive experiments demonstrate significant improvements in challenging settings where representative approaches falter or are less applicable.
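The joint training can be sketched as follows, reusing the `LPKA` and `low_rank_factors` helpers above; the stand-in models, data, and objective are hypothetical, chosen only to show the adapter receiving gradients together with both models:

```python
# Differently shaped toy models stand in for heterogeneous source/target.
src_model, tgt_model = nn.Linear(16, 8), nn.Linear(16, 4)
adapter = LPKA(rank=2)
opt = torch.optim.Adam(
    [*src_model.parameters(), *tgt_model.parameters(), *adapter.parameters()],
    lr=1e-3,
)

x, y = torch.randn(32, 16), torch.randint(0, 4, (32,))
for _ in range(100):
    # The source model trains on its own objective...
    src_loss = F.cross_entropy(src_model(x), y)
    # ...while its *current* low-rank factors are merged into the target,
    # so the transferred knowledge follows the source's training trajectory.
    src_us, _ = low_rank_factors(src_model.weight, rank=2)
    tgt_us, tgt_vh = low_rank_factors(tgt_model.weight, rank=2)
    w_merged = adapter(src_us, tgt_us) @ tgt_vh  # back to the target's shape
    tgt_loss = F.cross_entropy(F.linear(x, w_merged, tgt_model.bias), y)
    opt.zero_grad()
    (src_loss + tgt_loss).backward()
    opt.step()
```

Since gradients flow through the adapter, both models, and the transfer path in a single step, the target benefits not only from the source's final parameters but from how they evolve during training.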
The authors explore various scenarios, including cross-structure, cross-modal, and cross-task knowledge transfer, as well as self-knowledge transfer within a single model. The results show that MergeNet consistently outperforms existing knowledge transfer methods, highlighting its versatility and robustness in bridging the gap between heterogeneous models.