Key Concept
MatchXML is an efficient text-label matching framework for extreme multi-label text classification (XMC). It pairs dense label embeddings with fine-tuned Transformer models to reach state-of-the-art accuracy while training faster than competing methods.
Abstract
MatchXML is a framework for extreme multi-label text classification built on dense label embeddings and fine-tuned Transformer models. It reports state-of-the-art accuracy on five of six datasets and faster training than competing methods on all six.
The paper addresses the challenges of eXtreme Multi-label text Classification (XMC) and proposes MatchXML as a solution. Its main components are the generation of dense label embeddings, the construction of a Hierarchical Label Tree, and text-label matching in a bipartite graph. Experimental results show better performance than existing methods.
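The bipartite view of text-label matching can be illustrated with a minimal sketch. It assumes that each training sample contributes edges between one text node and its relevant label nodes, and that candidate labels are ranked by a plain dot product between embeddings. The function names and the scoring rule are illustrative assumptions, not the training objective used in the paper.

```python
from collections import defaultdict


def build_bipartite_edges(samples):
    """Index text-label edges in both directions.

    `samples` is an iterable of (text_id, label_ids) pairs (hypothetical format).
    """
    text_to_labels = defaultdict(set)
    label_to_texts = defaultdict(set)
    for text_id, label_ids in samples:
        for label_id in label_ids:
            text_to_labels[text_id].add(label_id)
            label_to_texts[label_id].add(text_id)
    return text_to_labels, label_to_texts


def top_k_labels(text_vec, label_vecs, k):
    """Rank labels by dot product with the text vector (assumed scoring rule)."""
    scores = {
        label_id: sum(a * b for a, b in zip(text_vec, vec))
        for label_id, vec in label_vecs.items()
    }
    # Sort by descending score; break ties by label id for determinism.
    return sorted(scores, key=lambda label_id: (-scores[label_id], label_id))[:k]
```

The edge index gives the positive text-label pairs, and any pair that is not an edge can serve as a negative when training a matcher.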
Key points include the use of label2vec to learn semantic dense label embeddings, the construction of a Hierarchical Label Tree, and the formulation of multi-label text classification as a text-label matching problem. MatchXML reaches state-of-the-art accuracy on multiple datasets by combining sparse TF-IDF features with dense vector features.
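One simple way to combine sparse TF-IDF features with dense vector features is to normalize each block and concatenate them. The sketch below assumes L2 normalization per block and a dictionary representation for the sparse vector; the summary does not state how MatchXML scales or joins the features, so both choices are assumptions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm; all-zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine_features(tfidf, vocab_size, dense):
    """Concatenate a sparse TF-IDF vector with a dense vector.

    `tfidf` maps term index -> weight (hypothetical sparse format).
    Each block is normalized separately so neither dominates by scale.
    """
    sparse_as_list = [tfidf.get(i, 0.0) for i in range(vocab_size)]
    return l2_normalize(sparse_as_list) + l2_normalize(dense)
```

For example, `combine_features({0: 3.0, 2: 4.0}, 3, [1.0, 0.0])` yields a five-dimensional vector whose first three entries come from TF-IDF and whose last two come from the dense embedding.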
The proposed method has four parts: training dense label vectors, constructing the Hierarchical Label Tree, fine-tuning a Transformer model, and adding static sentence embeddings. By combining these feature types, MatchXML improves performance on extreme multi-label text classification tasks.
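As an illustration of how a Hierarchical Label Tree could be built from dense label vectors, the following sketch recursively splits the label set with a small deterministic 2-means. The summary does not specify the clustering procedure, so the split rule, the `max_leaf` parameter, and the nested-list tree format are all assumptions made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_means(ids, vecs, iters=10):
    """Split label ids into two groups with a deterministic 2-means."""
    # Deterministic init: the first label, then the label farthest from it.
    c0 = vecs[ids[0]]
    c1 = vecs[max(ids, key=lambda i: sq_dist(vecs[i], c0))]
    left, right = [], []
    for _ in range(iters):
        left = [i for i in ids if sq_dist(vecs[i], c0) <= sq_dist(vecs[i], c1)]
        left_set = set(left)
        right = [i for i in ids if i not in left_set]
        if not left or not right:
            break
        c0 = mean([vecs[i] for i in left])
        c1 = mean([vecs[i] for i in right])
    return left, right


def build_tree(ids, vecs, max_leaf=2):
    """Return a nested list; leaves are sorted lists of label ids.

    Leaves hold at most `max_leaf` ids unless the vectors cannot be separated.
    """
    if len(ids) <= max_leaf:
        return sorted(ids)
    left, right = two_means(ids, vecs)
    if not left or not right:  # e.g. all vectors identical: stop splitting
        return sorted(ids)
    return [build_tree(left, vecs, max_leaf), build_tree(right, vecs, max_leaf)]
```

Labels with nearby embeddings end up in the same leaf, so a matcher can first pick promising clusters and only then score the labels inside them.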
Statistics
MatchXML achieves state-of-the-art accuracies on five out of six datasets.
MatchXML outperforms competing methods on all six datasets in terms of training speed.
Quotes
"We propose MatchXML, an efficient text-label matching framework for XMC."
"Experimental results demonstrate that MatchXML achieves the state-of-the-art accuracies on five out of six datasets."