The paper investigates the numerical approximation capacity of neural networks with bounded parameters. It makes the following key points:
The Universal Approximation Theorem suggests that neural networks have, in theory, unlimited approximation capacity, but this result assumes unbounded parameters. In practical settings where parameters are bounded, the achievable approximation capacity may be limited.
The authors introduce a new concept, the ε-outer measure, to quantify approximation capacity under a finite numerical tolerance, which makes it possible to compare the approximation capacity of different function families.
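A rough numerical proxy for this idea (an assumption of this summary, not the paper's formal measure-theoretic definition) is to test whether a sampled target function lies within a tolerance ε of the span of a sampled function family:

```python
import numpy as np

def within_tolerance(basis, target, eps):
    """True if `target` lies within eps (max norm) of span(columns of basis)."""
    coef, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return bool(np.max(np.abs(basis @ coef - target)) < eps)

# Toy check: can {1, x, x^2} reproduce cos(x) on [-1, 1] to within 1e-2?
x = np.linspace(-1.0, 1.0, 200)
basis = np.stack([np.ones_like(x), x, x**2], axis=1)
print(within_tolerance(basis, np.cos(x), eps=1e-2))
```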
Theoretically, the authors show that random parameter networks (such as Extreme Learning Machines, ELMs) and backpropagation-trained networks are equivalent in approximation capacity as the network width tends to infinity, so the capacity can be analyzed through random parameter networks.
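A minimal sketch of such a random parameter network, assuming a toy 1-D target, tanh activation, and NumPy only: the hidden weights and biases are drawn at random and frozen, and only the linear readout is solved by least squares, as in an ELM. The equivalence claim above suggests that widening this network should track what a backprop-trained network can approximate.

```python
# ELM-style random-feature network: random frozen hidden layer + least-squares readout.
import numpy as np

rng = np.random.default_rng(0)

def elm_fit(x, y, width, scale=1.0):
    """Fit an ELM: random tanh features, linear readout by least squares."""
    W = rng.normal(scale=scale, size=(1, width))    # random input weights (bounded in practice)
    b = rng.uniform(-scale, scale, size=width)      # random biases
    H = np.tanh(x @ W + b)                          # hidden layer output matrix
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)    # only the readout is "trained"
    return W, b, beta

def elm_predict(x, W, b, beta):
    return np.tanh(x @ W + b) @ beta

# Toy target: sin(3x) on [-pi, pi]; error shrinks as the width grows.
x = np.linspace(-np.pi, np.pi, 200)[:, None]
y = np.sin(3 * x)
for width in (10, 100, 1000):
    W, b, beta = elm_fit(x, y, width)
    err = np.max(np.abs(elm_predict(x, W, b, beta) - y))
    print(f"width={width:5d}  max error = {err:.2e}")
```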
However, the authors prove that, within a bounded Nonlinear Parameter Space (NPspace) and under finite numerical tolerance, the infinite-dimensional space spanned by neural functions can only be approximated by a finite-dimensional vector space. The dimension of that space, called the Numerical Span Dimension (NSdim), quantifies the limit on approximation capacity.
Numerically, the authors show that the Hidden Layer Output Matrix has only a limited number of numerically non-zero singular values, regardless of width, which leads to width redundancy and correlated neurons in wide networks. This helps explain the effectiveness of regularization techniques such as L1 and L2.
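This claim is easy to probe numerically (a sketch under assumed choices of activation, parameter bounds, and tolerance, not the paper's exact experiment): build the Hidden Layer Output Matrix of increasingly wide random networks with bounded parameters and count the singular values above a relative tolerance. The count, one plausible proxy for NSdim, saturates instead of growing with the width.

```python
# Count numerically non-zero singular values of the Hidden Layer Output Matrix H.
import numpy as np

rng = np.random.default_rng(1)
n_samples, eps = 500, 1e-7
x = np.linspace(-1.0, 1.0, n_samples)[:, None]

for width in (50, 500, 5000):
    W = rng.uniform(-1.0, 1.0, size=(1, width))     # bounded nonlinear parameters
    b = rng.uniform(-1.0, 1.0, size=width)
    H = np.tanh(x @ W + b)                          # hidden layer output matrix
    s = np.linalg.svd(H, compute_uv=False)
    numerical_rank = int(np.sum(s > eps * s[0]))    # singular values above tolerance
    print(f"width={width:5d}  numerical rank of H = {numerical_rank}")
```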
The analysis of NSdim provides insights into the trade-off between width, depth, and parameter space size in neural networks, as well as the phenomenon of neuron condensation.
Source: arxiv.org