The paper investigates different notions of linear connectivity of neural networks modulo permutation. It makes the following key observations:
Existing evidence only supports "weak linear connectivity" - that for each pair of networks, there exists a permutation under which the linear path between them has a low loss barrier.
The stronger claim of "strong linear connectivity" - that for each network, there exists one permutation that simultaneously connects it with every other network - is both intuitively and practically more desirable, as it would imply a loss landscape that is convex modulo permutation.
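Both notions are stated in terms of the loss barrier along the linear path between two parameter vectors. Below is a minimal sketch of that quantity, assuming flattened parameter arrays and a placeholder `loss_fn` (not code from the paper):

```python
import numpy as np

def loss_barrier(theta_a, theta_b, loss_fn, num_points=25):
    """Max height of the loss on the segment between two networks,
    relative to the linear interpolation of the endpoint losses."""
    end_a, end_b = loss_fn(theta_a), loss_fn(theta_b)
    barrier = 0.0
    for lam in np.linspace(0.0, 1.0, num_points):
        path_loss = loss_fn((1 - lam) * theta_a + lam * theta_b)
        baseline = (1 - lam) * end_a + lam * end_b
        barrier = max(barrier, path_loss - baseline)
    return barrier
```

In these terms, weak connectivity says each pair admits some permutation pi with `loss_barrier(theta_a, pi(theta_b))` near zero, while strong connectivity says a single pi per network achieves this against all other networks at once.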
The paper introduces an intermediate claim of "simultaneous weak linear connectivity" - that for certain pairs of sequences of networks (e.g., two training trajectories), there exists one permutation that simultaneously aligns each matching pair of networks across the sequences.
The paper provides empirical evidence for simultaneous weak linear connectivity:
- A single permutation aligns two networks at every matched checkpoint along their SGD training trajectories.
- The same permutation likewise aligns the sparse networks obtained from them by iterative magnitude pruning.
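As an illustration of what the trajectory experiment checks, the sketch below reuses the `loss_barrier` helper from above; `apply_permutation` is a hypothetical placeholder for applying one fixed permutation pi to a network's parameters:

```python
def simultaneous_barriers(checkpoints_a, checkpoints_b, pi, loss_fn):
    """Barrier between matched checkpoints of two runs after applying the
    SAME permutation `pi` to every checkpoint of run B. Simultaneous weak
    connectivity predicts that all returned barriers are near zero."""
    return [
        loss_barrier(theta_a, apply_permutation(pi, theta_b), loss_fn)
        for theta_a, theta_b in zip(checkpoints_a, checkpoints_b)
    ]
```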
The paper also discusses limitations of weight matching and activation matching algorithms used for aligning networks, and how they relate to network stability and feature emergence during training.
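For context, weight matching in this literature typically solves a linear assignment problem per layer to pair up units. The single-layer sketch below illustrates that idea under stated assumptions; it is not the paper's exact procedure:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_units(w_a, w_b):
    """Permutation of network B's units maximizing agreement with A.

    w_a, w_b: (num_units, fan_in) weight matrices for one layer.
    The full algorithm alternates this assignment step over all layers,
    since permuting a layer's units also re-indexes the next layer's inputs.
    """
    similarity = w_a @ w_b.T                      # unit-by-unit inner products
    _, perm = linear_sum_assignment(similarity, maximize=True)
    return perm                                   # w_b[perm] aligns with w_a
```

Activation matching is structurally identical, but the similarity matrix is built from correlations of unit activations recorded on a batch of inputs rather than from the weights themselves.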
Key ideas extracted from arxiv.org, by Ekansh Sharm... 04-10-2024
https://arxiv.org/pdf/2404.06498.pdf