Core Concepts
A multiobjective continuation method is presented that efficiently computes the regularization path of high-dimensional deep neural networks, connecting the sparsest solution to the non-regularized solution.
Abstract
The paper presents a multiobjective continuation method to efficiently compute the regularization path of deep neural networks (DNNs). The key insights are:
The regularization path of DNNs can be formulated as a multiobjective optimization problem, where the objectives are the empirical loss and the L1 norm of the weights, which promotes sparsity (see the formulation written out after this list).
The authors extend the concept of regularization paths from linear models to high-dimensional, nonlinear DNNs using a multiobjective proximal gradient method, enabling efficient computation of the entire Pareto front between the sparsest and the non-regularized solution.
The predictor-corrector approach used in the continuation method enables a structured way of training DNNs: starting from a very sparse model, the number of weights is gradually increased as long as overfitting is avoided (a simplified sketch of this loop follows the list).
Numerical experiments on the Iris, MNIST, and CIFAR-10 datasets demonstrate the superiority of the continuation method over the weighted-sum approach and evolutionary algorithms, especially in high-dimensional settings.
The authors show that the knowledge of the regularization path allows for the selection of well-generalizing network parametrizations, providing an alternative to standard pruning techniques.
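In symbols (our notation, not quoted from the paper): given training data (x_i, y_i), i = 1, ..., m, a network f_θ with weights θ ∈ R^n, and a per-sample loss ℓ, the multiobjective problem from the first insight reads:

```latex
\min_{\theta \in \mathbb{R}^n}
\begin{pmatrix}
  \frac{1}{m} \sum_{i=1}^{m} \ell\bigl(f_\theta(x_i), y_i\bigr) \\
  \lVert \theta \rVert_1
\end{pmatrix}
```

Tracing the Pareto front of this problem yields the regularization path, from θ = 0 (sparsest point) to the unregularized minimizer of the loss.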
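To make the predictor-corrector idea concrete, here is a minimal sketch on a toy least-squares problem. The paper's corrector is a genuinely multiobjective proximal gradient step; the sketch below substitutes the simpler scalarized proximal gradient (ISTA) update for a fixed trade-off parameter, so it only illustrates the structure of the continuation loop. All names, constants, and the toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical toy problem: smooth objective f1 = least-squares loss,
# nonsmooth objective f2 = L1 norm of the weights.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -1.5, 1.0]          # only 3 nonzero weights
b = A @ x_true + 0.01 * rng.normal(size=100)

def loss_grad(x):
    # Gradient of f1(x) = ||Ax - b||^2 / (2m)
    return A.T @ (A @ x - b) / len(b)

def soft_threshold(x, tau):
    # Proximal operator of tau * ||.||_1
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def corrector(x, lam, step=0.2, iters=300):
    # Scalarized stand-in for the multiobjective proximal gradient
    # corrector: proximal gradient steps for a fixed trade-off lam.
    for _ in range(iters):
        x = soft_threshold(x - step * loss_grad(x), step * lam)
    return x

# Continuation: start from the sparsest point (x = 0, large lam) and move
# toward the unregularized solution by shrinking lam, warm-starting each
# corrector run at the previous solution (the "predictor" step).
x = np.zeros(20)
for lam in np.geomspace(10.0, 1e-4, 15):
    x = corrector(x, lam)
    nnz = int(np.count_nonzero(np.abs(x) > 1e-8))
    loss = 0.5 * np.mean((A @ x - b) ** 2)
    print(f"lambda={lam:9.5f}  nonzeros={nnz:2d}  loss={loss:.5f}")
```

Running this prints a sequence of increasingly dense, increasingly accurate models, which is the structured sparse-to-dense training behavior the third insight describes; the paper's method achieves this for DNNs with millions of weights rather than a 20-dimensional toy problem.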
Stats
The paper reports no explicit numerical statistics; the key results are presented as plots and qualitative comparisons.
Quotes
"To the best of our knowledge, this is the first algorithm to compute the regularization path for non-convex multiobjective optimization problems (MOPs) with millions of degrees of freedom."
"We show the first algorithm that solves a truly high-dimensional deep learning problem in a very efficient manner."