The paper conducts a fine-grained analysis of the mode connectivity phenomenon in neural network loss landscapes. The key findings are:
Overparameterized neural networks, including two-layer ReLU networks and linear networks, exhibit a strong form of connectivity called "star-shaped connectivity". This means that for a finite set of typical global minima, there exists a single "center" minimum that is linearly connected to each of them, i.e., the loss stays low along every straight line from the center to one of the minima.
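The star-shaped property above can be tested empirically by measuring the loss barrier along the straight segment between a candidate center and another minimum. A minimal sketch (the `loss_fn`, `theta_center`, and toy quadratic landscape below are hypothetical stand-ins, not the paper's models):

```python
import numpy as np

def loss_barrier(loss_fn, theta_center, theta_i, n_points=25):
    """Estimate the loss barrier along the straight line between a candidate
    'center' minimum and another minimum.  Star-shaped connectivity predicts
    a (near-)zero barrier for every minimum theta_i in the set."""
    ts = np.linspace(0.0, 1.0, n_points)
    losses = np.array([loss_fn((1 - t) * theta_center + t * theta_i) for t in ts])
    # Barrier: worst loss on the segment minus the worse of the two endpoints.
    return losses.max() - max(losses[0], losses[-1])

# Toy convex landscape as a stand-in for a trained network's loss surface:
quadratic = lambda theta: float(np.sum(theta ** 2))
center = np.zeros(3)
other = np.ones(3)
barrier = loss_barrier(quadratic, center, other)
print(barrier)  # convexity along the segment => barrier 0.0
```

In practice `loss_fn` would evaluate the training loss of a network whose flattened parameters are `theta`, and the check would be repeated for every minimum in the set against the same center.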
The geodesic distance between global minima (the length of the shortest low-loss path connecting them), normalized by their Euclidean distance, monotonically decreases towards 1 as the network width increases. A ratio near 1 means the low-loss geodesics are nearly straight lines, suggesting the landscape becomes increasingly convex-like as the network becomes more overparameterized.
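The ratio above is straightforward to compute once a low-loss path has been found (e.g., as a sequence of waypoints in parameter space). A small illustrative sketch, with hypothetical 2-D paths in place of actual network parameters:

```python
import numpy as np

def normalized_path_length(path):
    """Length of a piecewise-linear path (rows = waypoints in parameter space)
    divided by the Euclidean distance between its endpoints.  A ratio close
    to 1 means the low-loss path between two minima is nearly straight."""
    segments = np.diff(path, axis=0)
    path_len = np.linalg.norm(segments, axis=1).sum()
    euclid = np.linalg.norm(path[-1] - path[0])
    return path_len / euclid

# Straight path between two points: ratio exactly 1.
straight = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]])
# Path that detours around a barrier: ratio > 1.
detour = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 0.0]])
print(normalized_path_length(straight))  # 1.0
print(normalized_path_length(detour))    # ~1.414
```

The paper's finding is that for wider networks this ratio, measured along low-loss geodesics between independently trained minima, approaches 1.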
These theoretical results are supported by extensive experiments on the MNIST and CIFAR-10 datasets, validating star-shaped connectivity and the near-unity geodesic-to-Euclidean distance ratio for practical neural network models.
The findings provide new insights into the underlying geometry and topology of neural network loss landscapes, which have important implications for understanding optimization and generalization in deep learning.
Key insights extracted from the paper by Zhanran Lin,... on arxiv.org, 04-10-2024: https://arxiv.org/pdf/2404.06391.pdf