The paper investigates different notions of linear connectivity of neural networks modulo permutation. It makes the following key observations:
Existing evidence supports only "weak linear connectivity" - that for each pair of trained networks, there exists a permutation under which the linear path between them has low loss.
The stronger claim of "strong linear connectivity" - that for each network there exists a single permutation which simultaneously connects it to every other network - is far more desirable both intuitively and practically, as it would imply an effectively convex loss landscape once permutation symmetries are accounted for.
The paper introduces an intermediate claim of "simultaneous weak linear connectivity" - that for certain sequences of networks, there exists one permutation that simultaneously aligns matching pairs of networks from these sequences.
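To make the distinctions concrete, the three claims can be sketched formally. Writing \(\mathcal{L}\) for the loss, \(\pi\) for a permutation of hidden units, and \(B(\cdot,\cdot)\) for the loss barrier along the linear path between two weight vectors, an illustrative rendering (not the paper's exact notation) is:

```latex
% Loss barrier along the linear path between weights \theta_1 and \theta_2
B(\theta_1,\theta_2) = \max_{\lambda\in[0,1]}
  \mathcal{L}\bigl(\lambda\theta_1+(1-\lambda)\theta_2\bigr)
  - \bigl[\lambda\,\mathcal{L}(\theta_1)+(1-\lambda)\,\mathcal{L}(\theta_2)\bigr]

% Weak: each pair of networks admits its own permutation
\forall\,\theta_A,\theta_B\ \exists\,\pi:\quad B\bigl(\theta_A,\pi(\theta_B)\bigr)\approx 0

% Strong: one permutation per network works against all others at once
\exists\,\pi_1,\dots,\pi_n\ \text{s.t.}\ \forall\,i,j:\quad B\bigl(\pi_i(\theta_i),\pi_j(\theta_j)\bigr)\approx 0

% Simultaneous weak: one permutation aligns matched pairs across whole sequences
\exists\,\pi\ \text{s.t.}\ \forall\,t:\quad B\bigl(\theta_A^{(t)},\pi(\theta_B^{(t)})\bigr)\approx 0
```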
The paper provides empirical evidence for simultaneous weak linear connectivity: a single permutation can align matched pairs of networks at every point along their training trajectories, and a single permutation likewise aligns the sequences of sparse subnetworks produced by iterative magnitude pruning.
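A minimal sketch of how such a test can be run, assuming flattened weight vectors, a scalar `loss_fn`, matched checkpoint lists from two training runs, and a hypothetical `apply_permutation` helper (none of these names come from the paper's code):

```python
import numpy as np

def loss_barrier(loss_fn, theta_a, theta_b, n_points=25):
    """Peak of the loss along the linear path between two flattened
    weight vectors, relative to interpolating the endpoint losses."""
    lams = np.linspace(0.0, 1.0, n_points)
    path = np.array([loss_fn((1 - l) * theta_a + l * theta_b) for l in lams])
    baseline = (1 - lams) * path[0] + lams * path[-1]
    return float(np.max(path - baseline))

def check_simultaneous_weak(loss_fn, ckpts_a, ckpts_b, perm,
                            apply_permutation, tol=0.05):
    """Verify that ONE fixed permutation keeps the barrier near zero at
    every matched pair of checkpoints along two training trajectories."""
    barriers = [loss_barrier(loss_fn, ta, apply_permutation(tb, perm))
                for ta, tb in zip(ckpts_a, ckpts_b)]
    return all(b < tol for b in barriers), barriers
```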
The paper also discusses limitations of the weight-matching and activation-matching algorithms used to align networks, and relates their behavior to network stability and to the emergence of features during training.
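As a concrete illustration of these matching procedures: for a two-layer MLP f(x) = W2 · relu(W1 · x), aligning hidden units reduces to a single linear assignment problem (deeper networks require coordinate descent over layers, as in Git Re-Basin-style weight matching), while the activation-matching variant builds the cost from unit activations on a probe batch. Shapes and helper names below are illustrative assumptions, not the paper's code:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def weight_match_two_layer(W1_a, W2_a, W1_b, W2_b):
    """Permutation of B's hidden units aligning it to A, maximizing
    <W1_a, P W1_b> + <W2_a, W2_b P^T> over permutation matrices P."""
    cost = W1_a @ W1_b.T + W2_a.T @ W2_b   # unit-by-unit weight similarity
    _, perm = linear_sum_assignment(cost, maximize=True)
    return perm                            # perm[i]: B's unit matched to A's unit i

def activation_match(acts_a, acts_b):
    """Same assignment step, but similarity comes from activations
    (rows = hidden units, columns = examples from a probe batch)."""
    _, perm = linear_sum_assignment(acts_a @ acts_b.T, maximize=True)
    return perm

def apply_perm(W1_b, W2_b, perm):
    # Re-index B's hidden units; the function B computes is unchanged.
    return W1_b[perm], W2_b[:, perm]
```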
Key insights distilled from Ekansh Sharma et al., arxiv.org, 04-10-2024: https://arxiv.org/pdf/2404.06498.pdf