Core Concepts
There exists a fixed neural network architecture with only O(d) intrinsic neurons that can approximate any d-variate continuous function on a d-dimensional hypercube to arbitrary accuracy, and this linear scaling in the input dimension is the best possible.
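Schematically (notation mine; the neuron count is the one reported in the abstract below), the claim is that there is a fixed architecture, realizing networks \phi_\theta with at most 366d + 365 intrinsic neurons, such that for every f \in C([a,b]^d) and every \varepsilon > 0 some choice of parameters \theta gives

    \sup_{x \in [a,b]^d} |f(x) - \phi_\theta(x)| < \varepsilon.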
Abstract
The paper presents two main results:
First, the existence of an EUAF (Elementary Universal Activation Function) neural network with only 366d + 365 intrinsic (non-repeated) neurons that can approximate any d-variate continuous function on a d-dimensional hypercube to arbitrary accuracy. This significantly improves on previous work, which required O(d^2) neurons.
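For intuition, here is a minimal sketch of a fixed activation in this spirit, pairing a periodic triangular wave on the nonnegative axis with a bounded soft-sign branch on the negative axis; this form is an illustrative assumption, and the exact EUAF used in the paper may differ.

```python
import numpy as np

def euaf_like(x):
    """Illustrative fixed activation in the EUAF spirit (assumed form,
    not necessarily the exact EUAF defined in the paper)."""
    x = np.asarray(x, dtype=float)
    # Triangular wave with period 2 on the nonnegative axis, values in [0, 1]
    tri = np.abs(x - 2.0 * np.floor((x + 1.0) / 2.0))
    # Bounded, strictly monotone soft-sign branch on the negative axis
    soft = x / (np.abs(x) + 1.0)
    return np.where(x >= 0.0, tri, soft)
```

Intuitively, the periodic branch is what lets a fixed number of neurons encode arbitrarily fine oscillations, something a fixed-size ReLU network cannot do.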
The key insights are:
Leveraging a variant of the Kolmogorov Superposition Theorem that requires only 1 outer function and 2d+1 inner functions, instead of the original version, which needs 2d+1 outer functions and d(2d+1) inner functions.
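A Lorentz-type variant of this kind can be written as follows (illustrative; the exact indexing and constants in the paper may differ):

    f(x_1, \dots, x_d) = \sum_{q=1}^{2d+1} g\Big( \sum_{p=1}^{d} \lambda_p \, h_q(x_p) \Big),

where g is the single outer function, h_1, \dots, h_{2d+1} are the inner functions, and \lambda_1, \dots, \lambda_d are fixed constants.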
Constructing EUAF sub-networks that approximate the outer and inner functions separately, and then composing them (see the sketch after the next point).
Ensuring that the linear combination of the inner-function approximations stays in [0, 1] by clipping it with min{max{·, 0}, 1} before it is fed to the outer network.
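A minimal numerical sketch of this assembly, assuming hypothetical callables in inner_nets and a callable outer_net (each standing in for a small fixed-size EUAF sub-network) and fixed constants lam; the names and structure are illustrative, not the paper's exact construction:

```python
import numpy as np

def clip_unit(t):
    # Range correction: min{max{t, 0}, 1}
    return np.minimum(np.maximum(t, 0.0), 1.0)

def superposition_forward(x, inner_nets, outer_net, lam):
    """Kolmogorov-superposition-style assembly (illustrative only).

    x          : array of shape (d,), a point in the hypercube
    inner_nets : list of 2d+1 callables, each approximating one inner function h_q
    outer_net  : callable approximating the single outer function g
    lam        : array of shape (d,), the fixed constants lambda_p
    """
    total = 0.0
    for h_q in inner_nets:
        # Linear combination of inner approximations, one coordinate at a time
        s = sum(lam[p] * h_q(x[p]) for p in range(len(x)))
        # Clip so the value fed to the outer network stays in [0, 1]
        total += outer_net(clip_unit(s))
    return total
```

Roughly, this is why the count of non-repeated neurons scales linearly in d: there are 2d+1 inner sub-networks of fixed size, reused across coordinates, plus one fixed-size outer sub-network.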
Second, the presentation of a family of continuous functions whose approximation to arbitrary accuracy requires width at least d (equivalently, at least d intrinsic neurons).
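Stated schematically (notation mine; the precise family and constant are given in the paper), the lower bound asserts that there exist f \in C([a, b]^d) and \varepsilon_0 > 0 such that every network \phi with fewer than d intrinsic neurons satisfies

    \sup_{x \in [a, b]^d} |f(x) - \phi(x)| \ge \varepsilon_0.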
Combining these results, the paper concludes that O(d) fixed intrinsic neurons is the optimal requirement for approximating functions in C([a, b]^d): the neuron count grows only linearly with the input dimension d, unlike other approximation methods whose parameter counts may grow exponentially with d.