Neural network loss landscapes exhibit surprising geometric properties, including star-shaped connectivity between global minima and near-convex geodesic distances, even for highly non-convex models.
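To make the claim concrete, here is a minimal sketch (not the paper's code) of how star-shaped connectivity can be probed empirically: train several minima independently, pick a candidate center point (the mean of the minima, used here purely as an illustration), and check that the loss stays low along the straight segment from the center to every minimum. The toy ReLU regressor and all names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 4))          # toy regression data
y = np.sin(X @ rng.normal(size=4))
D, H = 4, 16                           # input dim, hidden width

def unpack(theta):
    return theta[: D * H].reshape(D, H), theta[D * H :]

def loss_and_grad(theta):
    W1, w2 = unpack(theta)
    Z = X @ W1
    A = np.maximum(Z, 0.0)             # ReLU activations
    r = A @ w2 - y                     # residuals
    gw2 = 2.0 / len(y) * (A.T @ r)
    gW1 = 2.0 / len(y) * (X.T @ (np.outer(r, w2) * (Z > 0)))
    return np.mean(r**2), np.concatenate([gW1.ravel(), gw2])

def train(seed, steps=2000, lr=0.05):
    theta = np.random.default_rng(seed).normal(scale=0.5, size=D * H + H)
    for _ in range(steps):
        _, g = loss_and_grad(theta)
        theta -= lr * g
    return theta

minima = [train(s) for s in (1, 2, 3)]
center = np.mean(minima, axis=0)

# Scan the loss along the straight path from the center to each minimum;
# star-shaped connectivity predicts it stays low on every such path.
for k, theta in enumerate(minima):
    losses = [loss_and_grad((1 - t) * center + t * theta)[0]
              for t in np.linspace(0.0, 1.0, 21)]
    print(f"path to minimum {k}: max loss {max(losses):.4f}")
```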
The authors propose a structure-guided Gauss-Newton (SgGN) method that exploits both the least-squares structure and the neural-network structure of the objective function when training shallow ReLU networks.
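Below is a hedged sketch of the generic damped Gauss-Newton step that such a least-squares objective admits: residual Jacobians replace the Hessian, so curvature comes from `J.T @ J`. This is not the authors' SgGN update (their structure-guided curvature factorization follows the paper); it only illustrates the least-squares structure the method builds on, and the closed-form Jacobian that a shallow ReLU network makes available.

```python
import numpy as np

def gauss_newton_step(X, y, W1, w2, damping=1e-3):
    """One damped Gauss-Newton update for f(x) = relu(x @ W1) @ w2."""
    n, d = X.shape
    h = w2.size
    Z = X @ W1                          # pre-activations, (n, h)
    A = np.maximum(Z, 0.0)              # hidden activations
    r = A @ w2 - y                      # residuals, (n,)
    # Jacobian of the residuals w.r.t. [vec(W1); w2], shape (n, d*h + h):
    # dr_i/dW1[j,k] = w2[k] * 1[Z[i,k] > 0] * X[i,j],  dr_i/dw2[k] = A[i,k].
    mask = (Z > 0).astype(X.dtype)
    J_W1 = (X[:, :, None] * (mask * w2)[:, None, :]).reshape(n, d * h)
    J = np.hstack([J_W1, A])
    # Normal equations with Levenberg-style damping.
    M = J.T @ J + damping * np.eye(d * h + h)
    delta = np.linalg.solve(M, J.T @ r)
    theta = np.concatenate([W1.ravel(), w2]) - delta
    return theta[: d * h].reshape(d, h), theta[d * h :]
```

The appeal of the least-squares structure is that the Gauss-Newton matrix `J.T @ J` is positive semi-definite and requires only first-order residual derivatives, avoiding the indefinite full Hessian.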
A genetic-algorithm-based, quantization-aware approximation method (GQA-LUT) is proposed to handle diverse non-linear operations in Transformers efficiently using integer-only arithmetic.
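For context, the sketch below shows the kind of integer-only, piecewise-linear look-up-table (LUT) evaluation that methods like GQA-LUT target, using GELU as the example non-linearity. The breakpoints here are hand-picked and all constants (scale, fraction bits) are assumptions; in GQA-LUT the table parameters are searched by a genetic algorithm under quantization-aware constraints, which is omitted here.

```python
import numpy as np

def gelu(x):
    # Float reference (tanh approximation of GELU).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

SCALE = 2 ** 8                                   # fixed-point input/output scale
BREAKS = np.array([-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0])  # illustrative only

def build_lut(breaks, scale=SCALE, frac_bits=8):
    """Offline (float) precomputation of integer slopes/intercepts per segment."""
    ys = gelu(breaks)
    slopes = np.round((ys[1:] - ys[:-1]) / (breaks[1:] - breaks[:-1])
                      * 2**frac_bits).astype(np.int32)
    intercepts = np.round((ys[:-1] - slopes / 2**frac_bits * breaks[:-1])
                          * scale).astype(np.int32)
    qbreaks = np.round(breaks * scale).astype(np.int32)
    return qbreaks, slopes, intercepts, frac_bits

def lut_gelu_int(xq, qbreaks, slopes, intercepts, frac_bits):
    """Integer-only evaluation: y_q = (slope * x_q >> frac_bits) + intercept."""
    seg = np.clip(np.searchsorted(qbreaks, xq, side="right") - 1,
                  0, len(slopes) - 1)            # segment index, edge-saturated
    return (slopes[seg] * xq >> frac_bits) + intercepts[seg]
```

Usage: quantize inputs to fixed point, evaluate with integer ops only, and dequantize to compare against the float reference.

```python
x = np.linspace(-5.0, 5.0, 11)
xq = np.round(x * SCALE).astype(np.int32)
lut = build_lut(BREAKS)
print(lut_gelu_int(xq, *lut) / SCALE)  # LUT approximation
print(gelu(x))                         # float reference
```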