Core Concepts
A versatile toolkit for path-norm analysis of modern neural networks, yielding generalization bounds and a computable measure of network complexity.
Abstract
This work introduces a toolkit for path-norm analysis of modern neural networks, covering architectures with skip connections, pooling layers, and biases. It establishes a new generalization bound for ReLU networks based on the L1 path-norm that improves on existing path-norm-based bounds. The study also examines why path-norms are attractive in practice: they are easy to compute and invariant under symmetries such as neuron permutations and parameter rescalings that leave the network's function unchanged. Numerical evaluations on ResNets trained on ImageNet reveal a significant gap between the theoretical bounds and generalization as observed in practice.
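For reference, the Lq path-norm discussed here is standardly defined as a sum over all input-to-output paths of the network's computation graph (a textbook formulation; the paper's exact definition for general DAG architectures with pooling and biases may differ in notation):

```latex
\|\theta\|_{q} \;=\; \Bigl( \sum_{p \in \mathcal{P}} \; \prod_{e \in p} |\theta_e|^{q} \Bigr)^{1/q}
```

where \(\mathcal{P}\) is the set of paths from input (and bias) nodes to output nodes and \(\theta_e\) is the weight on edge \(e\); taking \(q = 1\) gives the L1 path-norm used in the generalization bound.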
Stats
Max-pooling kernel size K = 9 (a 3 × 3 window) for ResNet152.
L1 path-norm of pretrained ResNets: 1.3 × 10^30.
L2 path-norm of pretrained ResNets: 2.5 × 10^2.
L4 path-norm of pretrained ResNets: 7.2 × 10^-6.
Quotes
"The immediate interests of these tools are: 1) path-norms are easy to compute on modern networks via a single forward-pass; 2) path-norms are invariant under neuron permutations and parameter rescalings that leave the network invariant; and 3) the path-norms yield Lipschitz bounds."
"Path-norms tightly lower bound products of operator norms, another complexity measure that does not enjoy the same invariances as path-norms."