The paper conducts a fine-grained analysis of the mode connectivity phenomenon in neural network loss landscapes. The key findings are:
Overparameterized neural networks, including two-layer ReLU networks and linear networks, exhibit a strong form of connectivity called "star-shaped connectivity": for a finite set of typical global minima, there exists a single "center" minimum that is connected to all of them simultaneously by linear paths along which the loss stays near zero.
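A minimal sketch of how this property can be probed empirically, assuming PyTorch models trained to (near-)zero loss and a hypothetical `evaluate_loss(model)` helper (the paper's own experimental setup may differ):

```python
import torch

def interpolate_state(sd_a, sd_b, t):
    """Return the state dict (1 - t) * sd_a + t * sd_b (assumes float tensors)."""
    return {k: (1 - t) * sd_a[k] + t * sd_b[k] for k in sd_a}

def linear_barrier(model_a, model_b, make_model, evaluate_loss, steps=21):
    """Maximum loss along the straight segment between two parameter vectors.

    A barrier near zero means the two minima are linearly connected.
    """
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    losses = []
    for i in range(steps):
        t = i / (steps - 1)
        model = make_model()  # fresh model with the same architecture
        model.load_state_dict(interpolate_state(sd_a, sd_b, t))
        losses.append(evaluate_loss(model))
    return max(losses)

# Star-shaped connectivity: one candidate center is linearly connected
# to every minimum in the set, i.e. all barriers are (near) zero:
# barriers = [linear_barrier(center, m, make_model, evaluate_loss) for m in minima]
```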
The geodesic distance between global minima, measured within the low-loss region and normalized by their Euclidean distance, monotonically decreases towards 1 as the network width increases. Since this ratio is at least 1 by definition, a value approaching 1 means the shortest low-loss path between two minima is nearly the straight line segment connecting them, suggesting the landscape becomes increasingly convex-like as the network becomes more overparameterized.
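One way to estimate this ratio numerically is to optimize a polygonal path between two minima so that it stays in the low-loss set, then compare its length to the straight-line distance. A sketch under assumed interfaces: `theta_a` and `theta_b` are flat parameter vectors and `loss_fn(theta)` is a differentiable loss, both hypothetical stand-ins for a trained network; the penalty weight is arbitrary.

```python
import torch

def path_length(points):
    """Total length of the polygonal path through the given points."""
    return sum(torch.norm(points[i + 1] - points[i]) for i in range(len(points) - 1))

def geodesic_ratio(theta_a, theta_b, loss_fn, n_knots=8, steps=500, lr=1e-2):
    """Estimate the geodesic-to-Euclidean distance ratio between two minima.

    Interior knots of a polygonal path are optimized to trade off path
    length against the loss at each knot; the endpoints stay fixed.
    """
    ts = torch.linspace(0, 1, n_knots + 2)[1:-1]
    knots = [((1 - t) * theta_a + t * theta_b).clone().requires_grad_(True) for t in ts]
    opt = torch.optim.Adam(knots, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        pts = [theta_a, *knots, theta_b]
        # Penalize loss at the knots so the path stays inside the low-loss set.
        objective = path_length(pts) + 10.0 * sum(loss_fn(k) for k in knots)
        objective.backward()
        opt.step()
    with torch.no_grad():
        geo = path_length([theta_a, *knots, theta_b])
        return (geo / torch.norm(theta_b - theta_a)).item()

# A ratio close to 1 indicates the two minima are joined by a
# near-straight low-loss path, consistent with a convex-like landscape.
```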
These theoretical results are supported by extensive experiments on the MNIST and CIFAR-10 datasets, which validate star-shaped connectivity and near-unit geodesic distance ratios for practical neural network models.
The findings provide new insights into the underlying geometry and topology of neural network loss landscapes, with important implications for understanding optimization and generalization in deep learning.
Key insights from the original content by Zhanran Lin et al., arxiv.org, 04-10-2024: https://arxiv.org/pdf/2404.06391.pdf