Core Concepts
Neural networks can be mathematically modeled using convection-diffusion equations, providing a unified framework for understanding and improving network structures.
Stats
For any T > 0, with the time step ∆t = T/L, the residual block in (1) can be viewed as the explicit Euler discretization of the following ordinary differential equation (ODE): dx(t)/dt = v(x(t), t), x(0) = x0, t ∈ [0, T].
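The correspondence between residual updates and explicit Euler steps can be illustrated with a minimal sketch. It assumes the residual block has the form x_{l+1} = x_l + ∆t · v(x_l, t_l), with L taken to be the number of blocks; equation (1) is not reproduced here, so this form is an assumption. The velocity field v(x, t) = -x is a hypothetical stand-in for a learned layer, chosen only because its exact solution x0 · exp(-T) is known.

```python
import math


def resnet_forward(x0, v, T, L):
    """Apply L residual updates x <- x + dt * v(x, t).

    Each update is one explicit Euler step of dx/dt = v(x, t) with dt = T / L.
    """
    dt = T / L
    x = x0
    for layer in range(L):
        t = layer * dt          # time associated with this block
        x = x + dt * v(x, t)    # residual connection = Euler step
    return x


def v(x, t):
    # Hypothetical velocity field (assumption for illustration only).
    # In an actual ResNet this would be a learned, layer-dependent map.
    return -x


x0, T = 1.0, 1.0
exact = x0 * math.exp(-T)  # exact ODE solution at time T for v(x, t) = -x

# Error of the "network output" against the ODE solution for several depths.
errors = {L: abs(resnet_forward(x0, v, T, L) - exact) for L in (10, 100, 1000)}
for L, err in errors.items():
    print(f"L = {L:5d}  |x_L - x(T)| = {err:.6f}")
```

In this toy setting the gap between the L-block output and x(T) shrinks as L grows, which is the sense in which a deep residual network approximates the flow of the ODE.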
Furthermore, the connection between ODEs and partial differential equations (PDEs) through the well-known method of characteristics has motivated the analysis of ResNets from a PDE perspective. This includes theoretical analysis (Sonoda et al. [2019]), novel training algorithms (Sun Qi and Qiang [2020]), and improvements in adversarial robustness (Wang et al. [2020a]) for NNs.
The method of characteristics tells us that, along the curve defined by (2), the function value u(x, t) remains unchanged.
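A small numerical check can make the invariance along characteristics concrete. The sketch below assumes the standard transport equation ∂u/∂t + v · ∂u/∂x = 0 in one dimension; the PDE in the source is not shown here and may differ in sign or time direction. The velocity field v(x, t) = A · x, the rate A, and the initial profile f are all hypothetical choices made so that both the characteristic curve and the PDE solution have closed forms.

```python
import math

A = 0.5  # hypothetical rate in the velocity field v(x, t) = A * x


def f(x):
    # Hypothetical initial profile u(x, 0) = f(x).
    return math.tanh(x)


def characteristic(x0, t):
    # Solution of dx/dt = A * x with x(0) = x0.
    return x0 * math.exp(A * t)


def u(x, t):
    # Closed-form solution of u_t + A * x * u_x = 0 with u(x, 0) = f(x):
    # trace the point (x, t) back along its characteristic to time 0.
    return f(x * math.exp(-A * t))


x0 = 0.8
times = (0.0, 0.5, 1.0, 2.0)

# Evaluate u at points that all lie on the characteristic through x0.
values = [u(characteristic(x0, t), t) for t in times]
for t, val in zip(times, values):
    print(f"t = {t:.1f}  x(t) = {characteristic(x0, t):.4f}  u = {val:.12f}")
```

Every printed value of u agrees with f(x0) up to floating-point rounding, while the position x(t) changes with t: the value is carried along the curve rather than being fixed in space.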
Quotes
"NN can be viewed as the image u(·, t) of a mapping driven by a certain PDE." - Content Source