Core Concept
Minimizing the Chebyshev Prototype Risk, which bounds the deviation in similarity between an example's features and its class prototype, reduces overfitting in deep neural networks.
Summary
The paper presents a theoretical framework and a new training algorithm for reducing overfitting in deep neural networks.
Key highlights:
- Defines the "class prototype" as the mean feature vector of each class, and derives Chebyshev probability bounds on the deviation of an example's features from its class prototype (a notational sketch of the bound follows this list).
- Introduces a new metric called Chebyshev Prototype Risk (CPR) that bounds the deviation in similarity between an example's features and its class prototype.
- Proposes a multi-component loss function that minimizes CPR by reducing intra-class feature covariance and maximizing inter-class prototype separation (an illustrative loss sketch follows this list).
- Provides an efficient implementation that minimizes the intra-class feature covariance terms in O(J log J) time, versus O(J²) time for previous approaches (a naive O(J²) baseline is sketched after this list).
- Empirical results on CIFAR100 and STL10 datasets show that the proposed algorithm reduces overfitting and outperforms previous regularization techniques.
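As a rough illustration of the class prototype and a Chebyshev-style bound in the paper's spirit, the following sketch assumes its own notation (feature map f, class-c prototype p_c, similarity taken as an inner product, and p_c treated as fixed); it is not copied from the paper.

```latex
% Class prototype: mean feature vector over the N_c training examples of class c
p_c = \frac{1}{N_c} \sum_{x_i \in \mathcal{D}_c} f(x_i)

% Chebyshev-style bound on the deviation of an example's similarity to its
% class prototype from the expected similarity, for any t > 0
\Pr\Big( \big| \langle f(x), p_c \rangle - \mathbb{E}\big[\langle f(x), p_c \rangle\big] \big| \ge t \Big)
  \le \frac{\operatorname{Var}\big(\langle f(x), p_c \rangle\big)}{t^{2}}
```

Under the simplifying assumption that p_c is fixed, the variance in the bound expands as Var(⟨f(x), p_c⟩) = p_cᵀ Σ_c p_c for the intra-class feature covariance Σ_c, so shrinking intra-class covariance tightens the bound; this is what the loss components sketched next target.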
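A minimal PyTorch-style sketch of such a multi-component loss is given below. The function name cpr_style_loss, the weights lam_cov and lam_sep, and the exact penalty forms are illustrative assumptions rather than the paper's implementation; in particular, the intra-class term is simplified to a distance-to-prototype penalty rather than the paper's covariance terms.

```python
import torch
import torch.nn.functional as F

def cpr_style_loss(features, logits, labels, prototypes,
                   lam_cov=0.1, lam_sep=0.1):
    """Hypothetical CPR-style regularized loss (illustrative only).

    features:   (B, J) penultimate-layer feature vectors
    logits:     (B, C) classifier outputs
    labels:     (B,)   integer class labels
    prototypes: (C, J) running mean feature vector per class
    """
    # Standard classification term.
    ce = F.cross_entropy(logits, labels)

    # Intra-class term: pull each example's features toward its class
    # prototype (a simplification of reducing intra-class feature covariance).
    assigned = prototypes[labels]                       # (B, J)
    intra = ((features - assigned) ** 2).mean()

    # Inter-class term: push distinct class prototypes apart in cosine similarity.
    protos = F.normalize(prototypes, dim=1)
    sim = protos @ protos.t()                           # (C, C)
    off_diag = sim - torch.diag(torch.diag(sim))
    sep = off_diag.clamp(min=0).mean()

    return ce + lam_cov * intra + lam_sep * sep
```

In practice the prototypes would be maintained as running means of each class's features and updated during training, consistent with the prototype definition above.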
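For contrast with the O(J log J) claim, a naive penalty on the off-diagonal entries of a class's feature covariance matrix costs O(J²) time and space, as sketched below; the paper's log-linear-time, linear-space reduction itself is not reproduced here.

```python
import torch

def intra_class_cov_penalty_naive(features):
    """Naive O(J^2) penalty on off-diagonal intra-class feature covariance
    (illustrative baseline only).

    features: (N, J) feature vectors of examples from a single class.
    """
    centered = features - features.mean(dim=0, keepdim=True)          # (N, J)
    cov = centered.t() @ centered / max(features.shape[0] - 1, 1)      # (J, J)
    off_diag = cov - torch.diag(torch.diag(cov))
    return (off_diag ** 2).sum()
```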
Statistics
The summarized content does not include specific numerical data or metrics supporting the key claims; it focuses on the theoretical framework and algorithm design.
Quotes
"Overparameterized deep neural networks (DNNs), if not sufficiently regularized, are susceptible to overfitting their training examples and not generalizing well to test data."
"We utilize the class prototype, which is the class' mean feature vector, to derive Chebyshev probability bounds on the deviation of an example from it's class prototype and to design a new loss function that we empirically show to excel in performance and efficiency compared to previous algorithms."
"To the best of our knowledge, the first regularization algorithm to effectively optimize feature covariance in log-linear time and linear space, thus allowing our algorithm to scale effectively to large networks."