Key Concepts
The MoM learning method improves NER performance by addressing data imbalance.
Summary
The paper addresses data imbalance in named entity recognition (NER), a common problem in natural language processing. It introduces a learning method called majority or minority (MoM) learning. MoM learning adds the loss computed for samples belonging to the majority class to the conventional ML model's loss. The aim is to prevent minority classes from being misclassified as the majority class, improving prediction performance without sacrificing accuracy. The study evaluates MoM learning on four NER datasets in Japanese and English and reports consistent improvements across languages and frameworks.
The paper sets out the notation for sequential labeling in NER and explains that MoM learning works by adding the loss associated with the majority class to the conventional loss function. This makes weight adjustment simpler than in methods such as weighted cross-entropy and focal loss. In the reported experiments, MoM learning outperforms existing methods, including state-of-the-art techniques such as focal loss and dice loss, across the datasets and frameworks tested.
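The idea of adding a majority-class term to the conventional loss can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes token-level cross-entropy as the conventional loss, assumes the extra term is averaged over majority-class tokens only, and uses hypothetical names (`cross_entropy`, `mom_loss`, `majority_label`).

```python
import math

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the gold labels (0.0 for an empty batch)."""
    if not labels:
        return 0.0
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def mom_loss(probs, labels, majority_label=0):
    """Conventional loss plus the same loss restricted to majority-class tokens.

    Assumption: the majority class (e.g. the "O" tag) has index `majority_label`,
    and the extra term is normalized by the number of majority-class tokens.
    """
    base = cross_entropy(probs, labels)
    # Keep only the tokens whose gold label is the majority class.
    majority = [(p, y) for p, y in zip(probs, labels) if y == majority_label]
    maj_probs = [p for p, _ in majority]
    maj_labels = [y for _, y in majority]
    return base + cross_entropy(maj_probs, maj_labels)

# Three tokens, three classes; class 0 plays the role of the majority class.
probs = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.2, 0.7, 0.1]]
labels = [0, 0, 1]
print(mom_loss(probs, labels))
```

Unlike weighted cross-entropy, this sketch has no per-class weight vector to tune: the only choice is which class counts as the majority.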
The paper also stresses evaluating performance on the entity classes rather than overall scores alone, since entity-class performance is what matters in practice. It discusses the difficulties that traditional weighting methods such as weighted cross-entropy face on multiclass NER tasks with long-tail distributions. The results show MoM learning to be effective in both the sequential labeling and machine reading comprehension frameworks, which indicates that the method adapts across frameworks.
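A small example shows why an overall score can hide poor entity-class performance. It is only a sketch under simplifying assumptions: scoring is done per token rather than per entity span, "O" is taken as the majority tag, and the function names are hypothetical. The paper's own evaluation protocol may differ.

```python
def accuracy(gold, pred):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def f1_for_class(gold, pred, cls):
    """Token-level F1 for a single tag; 0.0 when there are no true positives."""
    tp = sum(g == cls and p == cls for g, p in zip(gold, pred))
    fp = sum(g != cls and p == cls for g, p in zip(gold, pred))
    fn = sum(g == cls and p != cls for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def entity_macro_f1(gold, pred, majority="O"):
    """Unweighted mean F1 over the entity tags in `gold`, excluding the majority tag."""
    classes = sorted(set(gold) - {majority})
    if not classes:
        return 0.0
    return sum(f1_for_class(gold, pred, c) for c in classes) / len(classes)

# A model that predicts "O" everywhere on a skewed tag sequence.
gold = ["O"] * 8 + ["PER", "LOC"]
pred = ["O"] * 10
print(accuracy(gold, pred))         # 0.8
print(entity_macro_f1(gold, pred))  # 0.0
```

The all-"O" predictor reaches 80% token accuracy while finding no entities at all, which is the kind of failure an entity-class score exposes.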
Statistics
Evaluation experiments on four NER datasets (Japanese and English)
Performance comparison using BERT in sequential labeling
Comparison of entity performance with and without MoM learning
Performance summary using the machine reading comprehension (MRC) framework on CoNLL2003
Quotes
"MoM learning is a simple and effective method that suppresses misclassifications of minority classes as the majority class."
"MoM learning outperforms existing methods across various datasets and frameworks."
"The results confirm MoM learning consistently improves prediction performance without sacrificing accuracy."