Core Concepts
This paper proposes novel empirical risk functions and regularizers, inspired by f-divergences and the α-Rényi divergence, for self-training methods in semi-supervised learning such as pseudo-labeling and entropy minimization. These divergence-based approaches are more robust to noisy pseudo-labels than traditional self-training methods.
Abstract
The paper investigates a range of empirical risk functions and regularization methods suitable for self-training methods in semi-supervised learning. These approaches draw inspiration from various divergence measures, such as f-divergences and α-Rényi divergences.
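For reference, the textbook definitions of the divergence families named above are sketched below for discrete distributions P and Q over a finite label set. The notation is an assumption made here and may differ from the paper's: f is a convex function with f(1) = 0, M is the equal-weight mixture of P and Q, and α is the Rényi order.

```latex
% f-divergence, for convex f with f(1) = 0
D_f(P \,\|\, Q) = \sum_{y} Q(y)\, f\!\left(\frac{P(y)}{Q(y)}\right)

% Jensen-Shannon divergence, with M = \tfrac{1}{2}(P + Q)
\mathrm{JS}(P \,\|\, Q) = \tfrac{1}{2}\,\mathrm{KL}(P \,\|\, M) + \tfrac{1}{2}\,\mathrm{KL}(Q \,\|\, M)

% Renyi divergence of order \alpha > 0, \alpha \neq 1
D_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1} \log \sum_{y} P(y)^{\alpha}\, Q(y)^{1-\alpha}
```

The KL divergence is the f-divergence with f(t) = t log t, and the Rényi divergence tends to KL as α approaches 1.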
The key highlights and insights are:
- Proposed novel divergence-based empirical risk functions (DERs), built on f-divergences and the α-Rényi divergence, for supervised learning applications.
- Adapted the DERs for semi-supervised learning (SSL) scenarios, specifically for pseudo-labeling and entropy minimization techniques.
- Provided an upper bound on the true risk of some DERs under the fully supervised learning (FSL) scenario, for the case where the divergence is a metric.
- Demonstrated the robustness of the proposed DERs to noisy pseudo-labels generated by self-training methods, compared to traditional approaches.
- Introduced two algorithms, DP-SSL and DEM-SSL, that leverage the proposed DERs for pseudo-labeling and entropy minimization in SSL, respectively.
- Empirical analysis showed that the DERs, particularly JS-ERM (the Jensen-Shannon-based variant), outperform traditional self-training methods in the presence of noisy pseudo-labels.
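As a rough illustration of why a bounded divergence can help with noisy pseudo-labels, here is a minimal sketch. It is not the paper's DP-SSL or DEM-SSL algorithm. It assumes the Jensen-Shannon divergence between a one-hot (pseudo-)label and the predicted class probabilities as the per-sample loss. The function names (`js_divergence`, `js_risk`, `cross_entropy_risk`, `pseudo_labels`) and the `EPS` clipping constant are hypothetical choices made for this example.

```python
import numpy as np

EPS = 1e-12  # illustrative constant that guards against log(0)


def kl_divergence(p, q):
    """KL(p || q) along the last axis for discrete distributions."""
    p = np.clip(p, EPS, 1.0)
    q = np.clip(q, EPS, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log), bounded above by log 2."""
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def one_hot(labels, num_classes):
    """Turn integer labels into one-hot rows."""
    return np.eye(num_classes)[np.asarray(labels)]


def js_risk(labels, probs):
    """Mean JS divergence between one-hot labels and predicted probabilities."""
    probs = np.asarray(probs, dtype=float)
    return float(np.mean(js_divergence(one_hot(labels, probs.shape[-1]), probs)))


def cross_entropy_risk(labels, probs):
    """Standard cross-entropy risk, shown only for comparison."""
    probs = np.asarray(probs, dtype=float)
    picked = probs[np.arange(len(labels)), np.asarray(labels)]
    return float(np.mean(-np.log(np.clip(picked, EPS, 1.0))))


def pseudo_labels(probs):
    """Hard pseudo-labels: the most probable class for each unlabeled sample."""
    return np.argmax(np.asarray(probs), axis=-1)


# A confident prediction scored against a wrong (noisy) pseudo-label.
probs = np.array([[0.999999, 0.000001]])
noisy_label = [1]
print("JS risk:", js_risk(noisy_label, probs))  # stays below log 2
print("CE risk:", cross_entropy_risk(noisy_label, probs))  # grows without bound
```

Because the Jensen-Shannon divergence never exceeds log 2, a single wrong pseudo-label contributes a bounded amount to this risk, whereas its cross-entropy contribution grows without limit as the prediction becomes more confident. This boundedness is one intuition for the robustness claim. The paper's own definitions and analysis should be consulted for the precise statement.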
Stats
The paper does not provide any specific numerical data or statistics. It focuses on the theoretical development of divergence-based empirical risk functions and their application to semi-supervised learning.
Quotes
"Our empirical risk functions are more robust to noisy pseudo-labels (i.e., the pseudo-label is different from the true label) of unlabeled data samples, which are generated by self-training approaches."
"Inspired by some divergences properties, we also provide an upper bound on the true risk of some empirical risk functions."