Core Concepts
Label smoothing negatively affects selective classification by exacerbating overconfidence and underconfidence, degrading the model's ability to reject its own misclassifications.
Abstract
This work examines the impact of label smoothing on selective classification performance. It shows that label smoothing consistently degrades selective classification despite improving accuracy. An analysis of logit-level gradients explains how label smoothing exacerbates overconfidence and underconfidence, and logit normalization is shown to be effective in recovering the degraded performance. Experimental details, results, and related work are also discussed.
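The logit-level gradient argument can be sketched as follows; this is a reconstruction of the standard smoothed cross-entropy gradient under the assumption of uniform label smoothing with weight \(\alpha\) over \(K\) classes and true class \(t\), not the paper's exact derivation:

```latex
% Smoothed cross-entropy loss and its gradient with respect to the logits z,
% where p = softmax(z).
\[
L_{\mathrm{LS}} = -\sum_{k=1}^{K} \tilde{y}_k \log p_k,
\qquad
\tilde{y}_k = (1-\alpha)\,\mathbf{1}[k=t] + \frac{\alpha}{K},
\qquad
\frac{\partial L_{\mathrm{LS}}}{\partial z_k} = p_k - \tilde{y}_k .
\]
% For the true class t, as the model grows confident (p_t \to 1, i.e. the true
% probability of error shrinks), the gradient on the max logit does not vanish:
\[
\frac{\partial L_{\mathrm{LS}}}{\partial z_t}
  = p_t - \Bigl(1 - \alpha + \tfrac{\alpha}{K}\Bigr)
  \;\longrightarrow\; \frac{\alpha(K-1)}{K} > 0
  \quad \text{as } p_t \to 1,
\]
% whereas with hard targets (alpha = 0) it tends to zero, so label smoothing keeps
% pushing the max logit down precisely when the prediction is most likely correct.
```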
Introduction
Label smoothing (LS) is a popular regularization method for deep neural networks.
LS negatively impacts selective classification (SC) by exacerbating overconfidence and underconfidence.
Logit normalization is effective in mitigating the degradation caused by LS.
Preliminaries
LS redistributes a fraction of the target probability mass from the ground-truth class uniformly across the other classes, which typically improves classification accuracy (sketched below).
SC aims to reject misclassifications based on predictive uncertainty.
The maximum softmax probability (MSP) is used as the confidence score for rejection in SC.
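A minimal sketch of these two ingredients, assuming uniform label smoothing with weight alpha and MSP-thresholding for rejection; the threshold, class count, and random logits are illustrative values, not taken from the paper:

```python
import numpy as np

def smoothed_targets(labels, num_classes, alpha=0.1):
    """Uniform label smoothing: take weight alpha from the one-hot target
    and spread it evenly over all num_classes classes."""
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - alpha) * one_hot + alpha / num_classes

def msp_selective_prediction(logits, threshold=0.7):
    """Selective classification with the maximum softmax probability (MSP):
    predict the argmax class, abstain whenever the MSP is below the threshold."""
    z = logits - logits.max(axis=1, keepdims=True)      # for numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    msp = probs.max(axis=1)
    preds = probs.argmax(axis=1)
    accept = msp >= threshold                            # False = reject / abstain
    return preds, msp, accept

# Illustrative usage on random 5-class logits.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 5))
targets = smoothed_targets(np.array([0, 1, 2, 3]), num_classes=5, alpha=0.1)
preds, msp, accept = msp_selective_prediction(logits)
```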
Experimental Details
Evaluation is conducted on the ImageNet and Cityscapes datasets.
Models are trained from scratch with varying levels of label smoothing.
Logit normalization is applied to recover the SC performance lost to LS.
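A minimal sketch of both steps, assuming PyTorch; the smoothing level, the temperature tau, and the L2-norm form of logit normalization are illustrative assumptions rather than the paper's exact settings:

```python
import torch
import torch.nn.functional as F

# Training-time loss with label smoothing; 0.1 is an illustrative level,
# the experiments sweep several values.
criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.1)

def normalized_msp(logits, tau=0.05, eps=1e-7):
    """Post-hoc logit normalization before computing the MSP confidence score:
    divide each logit vector by its L2 norm, scaled by a temperature tau.
    The exact normalization variant used in the paper is assumed here."""
    norm = logits.norm(p=2, dim=-1, keepdim=True).clamp_min(eps)
    probs = F.softmax(logits / (tau * norm), dim=-1)
    return probs.max(dim=-1).values   # higher = more confident; used to rank/reject

# Illustrative usage: score a batch of logits from an LS-trained model.
logits = torch.randn(8, 1000)         # e.g. 1000-way ImageNet logits
confidence = normalized_msp(logits)
```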
Results
LS consistently degrades SC performance across different architectures and tasks.
Overconfidence and underconfidence are exacerbated by LS, impairing the model's ability to reject misclassifications.
Logit normalization effectively improves SC performance for LS-trained models.
Related Work
Comparison with other prediction-with-rejection settings, such as out-of-distribution detection.
Discussion of Mixup techniques, which also soften training targets.
Exploration of label smoothing beyond selective classification scenarios.
Stats
LS leads to noticeable degradation in SC performance despite improving accuracy.
LS weakens the ability of the MSP score to separate correct from incorrect predictions.
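One illustrative way to quantify this separability (not necessarily the metric used in the paper) is the AUROC of the MSP score for distinguishing correct from incorrect predictions:

```python
import numpy as np

def msp_separability_auroc(msp, correct):
    """AUROC of the MSP score for separating correct from incorrect predictions,
    computed as the fraction of (correct, incorrect) pairs ranked correctly."""
    msp = np.asarray(msp, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    pos, neg = msp[correct], msp[~correct]   # correct predictions should score higher
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need both correct and incorrect predictions")
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

# Illustrative usage with made-up confidence scores and correctness flags.
auroc = msp_separability_auroc(msp=[0.9, 0.8, 0.95, 0.6, 0.7, 0.55],
                               correct=[True, True, True, False, True, False])
```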
Quotes
"Label smoothing increasingly regularizes the max logit as the true probability of error decreases."
"Logit normalization effectively reverses the imbalanced regularisation caused by label smoothing."