Essential concepts
This paper establishes tight generalization bounds for training deep neural networks with ReLU activation and logistic loss in binary classification problems. The authors develop a novel theoretical analysis to overcome the challenges posed by the unboundedness of the target function for the logistic loss.
Summary
The paper studies the binary classification problem using deep neural networks (DNNs) with the rectified linear unit (ReLU) activation function, trained with the logistic loss (also known as the cross-entropy loss).
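For reference, the logistic loss and the empirical logistic risk that is minimized can be written as follows (standard definitions; the notation here is ours rather than the paper's):

\[
\phi(t) = \log\bigl(1 + e^{-t}\bigr), \qquad \widehat{\mathcal{R}}_n(f) = \frac{1}{n}\sum_{i=1}^{n} \phi\bigl(y_i f(x_i)\bigr), \quad y_i \in \{-1, +1\},
\]

where \(f\) ranges over a class of fully connected ReLU DNNs.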
Key highlights and insights:
The authors develop an elegant oracle-type inequality to deal with the unboundedness of the target function for the logistic loss, which is the main obstacle in deriving satisfactory generalization bounds.
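To see why unboundedness arises: the population minimizer of the logistic risk is the log-odds of the conditional class probability \(\eta(x) = \mathbb{P}(Y = 1 \mid X = x)\) (a standard fact; notation ours),

\[
f^{*}(x) = \log\frac{\eta(x)}{1 - \eta(x)},
\]

which diverges wherever \(\eta(x)\) approaches \(0\) or \(1\), so \(f^{*}\) need not be bounded and standard analyses that assume a bounded target do not apply directly.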
Using the oracle-type inequality, the authors establish tight generalization bounds for fully connected ReLU DNN classifiers trained by empirical logistic risk minimization. They obtain optimal convergence rates (up to some logarithmic factors) for the excess logistic risk and excess misclassification error under various conditions, such as:
When the conditional class probability function is Hölder smooth (see the definition sketched after this list)
Under a compositional assumption on the conditional class probability function, which can explain the success of DNNs in overcoming the curse of dimensionality
When the decision boundary is piecewise smooth and the input data are bounded away from it
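For context, Hölder smoothness here refers to the usual notion applied to \(\eta(x) = \mathbb{P}(Y = 1 \mid X = x)\) (a standard definition; the paper's exact formulation may differ in details): for \(\beta \in (0, 1]\),

\[
|\eta(x) - \eta(x')| \le C \, \lVert x - x' \rVert^{\beta} \quad \text{for all } x, x',
\]

and for \(\beta > 1\) one also requires partial derivatives up to order \(\lfloor \beta \rfloor\) to exist, with the highest-order ones being \((\beta - \lfloor \beta \rfloor)\)-Hölder continuous.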
The authors justify the optimality of the derived convergence rates by proving corresponding minimax lower bounds.
As a key technical contribution, the authors derive a new tight error bound for the approximation of the unbounded natural logarithm function by ReLU DNNs, which plays a crucial role in establishing the optimal convergence rates.
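As a rough numerical illustration of the difficulty (a minimal sketch, not the paper's construction; the function log_interpolant_as_relu_net and all parameter choices below are hypothetical): a one-hidden-layer ReLU network can represent any continuous piecewise-linear interpolant of the logarithm exactly, and for a fixed budget of hidden units the uniform error on \([\delta, 1]\) degrades as \(\delta \to 0\), reflecting the blow-up of \(\log\) near zero.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def log_interpolant_as_relu_net(delta, n_knots):
    """Piecewise-linear interpolant of log on [delta, 1], written exactly as a
    one-hidden-layer ReLU network (hypothetical illustration, not the paper's
    construction)."""
    knots = np.geomspace(delta, 1.0, n_knots)           # finer resolution near 0
    vals = np.log(knots)
    slopes = np.diff(vals) / np.diff(knots)             # slope on each segment
    coeffs = np.diff(np.concatenate(([0.0], slopes)))   # output-layer weights

    def net(x):
        # Hidden units ReLU(x - knot_i), followed by a linear output layer.
        return vals[0] + relu(x[:, None] - knots[:-1][None, :]) @ coeffs

    return net

# Fixed number of hidden units; shrink delta and watch the sup-norm error grow.
for delta in (1e-1, 1e-2, 1e-3):
    net = log_interpolant_as_relu_net(delta, n_knots=64)
    x = np.linspace(delta, 1.0, 100_000)
    err = np.max(np.abs(net(x) - np.log(x)))
    print(f"delta={delta:.0e}  sup error on [delta, 1]: {err:.3e}")
```

With the number of hidden units held fixed, the printed error grows as \(\delta\) shrinks; controlling this trade-off tightly is roughly what the paper's new approximation bound for the logarithm accomplishes.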
Overall, the paper provides a novel theoretical analysis and tight generalization bounds for binary classification with deep neural networks and the logistic loss, which significantly advance the understanding of this important problem.