Core Concept
There exists a fixed neural network architecture with O(d) intrinsic neurons that can approximate any d-variate continuous function on a d-dimensional hypercube to arbitrary accuracy; this linear scaling in the input dimension is the best possible.
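Read formally, the claim has roughly the following shape (a paraphrase of the sentence above, not the paper's exact theorem statement; the symbol φ_σ(·; θ) for the fixed architecture with activation σ and parameters θ is introduced here only for illustration):

```latex
% Paraphrase of the core claim: \sigma is the fixed (EUAF) activation and
% \phi_\sigma(\cdot;\theta) the fixed architecture with O(d) intrinsic neurons.
\[
  \exists \text{ a fixed } \sigma\text{-architecture with } O(d) \text{ neurons such that }
  \forall f \in C([a,b]^d),\ \forall \varepsilon > 0,\ \exists\, \theta \ \text{ with }\
  \sup_{x \in [a,b]^d} \bigl| f(x) - \phi_\sigma(x;\theta) \bigr| \le \varepsilon .
\]
```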
Summary
The paper presents two main results:
- Existence of an EUAF (Elementary Universal Activation Function) neural network with only 366d + 365 intrinsic (non-repeated) neurons that can approximate any d-variate continuous function on a d-dimensional hypercube to arbitrary accuracy. This is a significant improvement over prior work, which required O(d^2) neurons.
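For concreteness, here is a small sketch of what the activation itself might look like. The piecewise form below (a period-2 triangle wave on [0, ∞) and the softsign x/(|x|+1) on (−∞, 0)) is a recollection of how the prior EUAF work defines it and is not stated in this summary, so treat the exact formula as an assumption; what matters for the result is only that a single fixed, elementary activation is reused everywhere.

```python
import numpy as np

def euaf(x):
    """Sketch of an elementary universal activation function (EUAF).

    Assumed form (recollection of the earlier EUAF construction, not confirmed
    by this summary): a period-2 triangle wave for x >= 0 and the softsign
    branch x / (|x| + 1) for x < 0.
    """
    x = np.asarray(x, dtype=float)
    triangle = np.abs(x - 2 * np.floor((x + 1) / 2))  # hat wave with values in [0, 1]
    softsign = x / (np.abs(x) + 1)                    # bounded, strictly increasing
    return np.where(x >= 0, triangle, softsign)
```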
The key insights are:
- Leveraging a variant of the Kolmogorov Superposition Theorem that requires only one outer function and 2d+1 inner functions, rather than the original version, which needs 2d+1 outer functions and (2d+1)(d+1) inner functions.
- Constructing the EUAF network to approximate the outer and inner functions separately, and then combining them.
- Ensuring the range of the linear combination of the inner function approximations is [0, 1] by applying min{max{·, 0}, 1}.
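Taken together, these three steps suggest the following structural sketch: 2d+1 inner sub-networks whose outputs enter a linear combination, the combination is clipped into [0, 1], and a single shared outer sub-network is applied to each clipped value. The wiring and the names (clip01, kst_style_forward, inner_nets, outer_net, lambdas) are illustrative assumptions, not the paper's exact construction or neuron counts.

```python
import numpy as np

def clip01(t):
    # Range control from the construction: min{max{t, 0}, 1}, so the outer
    # sub-network is only queried on [0, 1], where its approximation is controlled.
    return float(np.minimum(np.maximum(t, 0.0), 1.0))

def kst_style_forward(x, inner_nets, outer_net, lambdas):
    """Illustrative forward pass in the spirit of the KST variant above.

    x          : a point in the d-dimensional hypercube, as a length-d sequence.
    inner_nets : 2d+1 small fixed-size EUAF sub-networks, each approximating one
                 inner function (applied coordinate-wise here).
    outer_net  : one fixed-size EUAF sub-network approximating the single outer
                 function; it is shared by all 2d+1 terms.
    lambdas    : length-d coefficients of the linear combination whose range is
                 forced into [0, 1] before the outer sub-network sees it.
    """
    d = len(x)
    assert len(inner_nets) == 2 * d + 1
    total = 0.0
    for q in range(2 * d + 1):
        # Linear combination of inner approximations, clipped to [0, 1].
        s = clip01(sum(lambdas[p] * inner_nets[q](x[p]) for p in range(d)))
        total += outer_net(s)  # shared outer approximation, one evaluation per term
    return total
```

Because every sub-network has a size that is independent of the target function and of the accuracy, the total intrinsic neuron count is a fixed affine function of d, which is how a bound of the form 366d + 365 can arise.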
The second main result is a family of continuous functions that cannot be approximated to arbitrary accuracy by any network of width less than d; in other words, at least d intrinsic neurons are necessary.
Combining the two results, the paper concludes that O(d) fixed intrinsic neurons is the optimal requirement for approximating functions in C([a, b]^d): the 366d + 365 upper bound is matched, up to a constant factor, by the width-d lower bound, and the neuron count grows only linearly with the input dimension d, unlike other approximation methods whose parameter counts may grow exponentially with d.
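To make that contrast concrete, here is a standard back-of-the-envelope count (an illustration of the general phenomenon, not a figure from the paper), assuming the target is a 1-Lipschitz function on [0, 1]^d approximated by piecewise-constant values on a uniform grid:

```latex
% Assumed setting (illustration only): f is 1-Lipschitz on [0,1]^d, accuracy \varepsilon,
% piecewise-constant approximation on a uniform grid with cell width of order \varepsilon.
\[
  N_{\mathrm{grid}}(\varepsilon, d) \approx \varepsilon^{-d}
  \quad\text{(exponential in } d\text{)},
  \qquad
  N_{\mathrm{EUAF}}(d) = 366\,d + 365
  \quad\text{(linear in } d\text{, independent of } \varepsilon\text{)}.
\]
```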