Core Concepts
This survey provides a detailed taxonomy and analysis of state-of-the-art algorithmic solutions to long-tailed classification in deep learning.
Abstract
This survey presents a comprehensive overview of deep learning techniques for long-tailed classification. It makes the following key contributions:
Provides a taxonomy of algorithmic-level solutions, categorizing them into four main branches: Loss Reweighting, Margin-based Logit Adjustment, Optimized Representation Learning, and Balanced Classifier Learning.
Describes the intuition and mathematical formulations behind the methods in each category, highlighting their interconnections and dependencies.
Discusses efficient metrics and strategies for evaluating and comparing state-of-the-art long-tailed classification algorithms, including standard metrics, convergence studies, classifier analysis, and feature distribution analysis.
Identifies existing challenges, research gaps, and potential future directions in deep long-tail classification, particularly in the areas of online learning and zero-shot learning.
The survey covers the key advancements in this field over the past few years, offering researchers and practitioners a unified understanding of the various algorithmic techniques and their trade-offs for addressing the long-tailed classification problem.
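As a rough illustration of the first two branches in the taxonomy above, the following Python sketch shows one common form of each: inverse-frequency loss reweighting, and a class-dependent margin subtracted from the true-class logit. It is not taken from the survey. The function names, the example class counts, the `max_margin` value, and the count^(-1/4) margin schedule (in the style of LDAM-type losses) are all assumptions chosen for illustration.

```python
import math

def inverse_frequency_weights(class_counts):
    """Loss reweighting (assumed variant): weight = total / (num_classes * count).
    A perfectly balanced dataset yields a weight of 1.0 for every class."""
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * n) for n in class_counts]

def class_margins(class_counts, max_margin=0.5):
    """Margin schedule (assumed variant): margin proportional to count ** -0.25,
    rescaled so the rarest class receives max_margin."""
    raw = [n ** -0.25 for n in class_counts]
    scale = max_margin / max(raw)
    return [r * scale for r in raw]

def cross_entropy(logits, label, weights=None, margins=None):
    """Softmax cross-entropy for one sample, with optional per-class weight
    and optional margin subtracted from the true-class logit."""
    adjusted = list(logits)
    if margins is not None:
        adjusted[label] -= margins[label]  # make the true class harder to satisfy
    peak = max(adjusted)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in adjusted))
    loss = log_sum_exp - adjusted[label]
    if weights is not None:
        loss *= weights[label]  # rare classes contribute more to the total loss
    return loss

# Hypothetical long-tailed dataset: one head class, one medium class, one tail class.
counts = [900, 90, 10]
weights = inverse_frequency_weights(counts)
margins = class_margins(counts)

logits = [1.0, 0.5, 0.2]
plain = cross_entropy(logits, label=2)
reweighted = cross_entropy(logits, label=2, weights=weights)
with_margin = cross_entropy(logits, label=2, margins=margins)
```

For a tail-class sample, both variants produce a larger loss than plain cross-entropy, which is the shared intuition: the optimizer is pushed to spend more effort on classes with few samples. Reweighting scales the loss itself, while the margin variant changes the decision boundary the loss asks for.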
Stats
"Many data distributions in the real world are hardly uniform. Instead, skewed and long-tailed distributions of various kinds are commonly observed."
"In training data, some classes tend to have a significantly larger number of samples compared to the other classes causing a long-tailed distribution."
"Machine learning in such settings, traditional machine learning or deep learning, creates an inherent bias towards majority classes during training."
Quotes
"When the class imbalance ratio increases, the class margin for the minority class grows thinner. In other words, the minority class tend to fit the data, leaving insufficient generalization."
"Learning from imbalanced data remains a challenging research problem and a problem that must be solved as we move towards more real-world applications of deep learning."