Core Concepts
Bi-encoder-based detectors achieve superior out-of-distribution (OOD) detection in NLP without requiring labeled OOD samples.
Abstract
This paper introduces a novel method using bi-encoder-based detectors for out-of-distribution (OOD) detection in Natural Language Processing (NLP). The study compares detectors built on several sentence encoders, including Universal Sentence Encoder (USE), BERT, MPNet, and GloVe, on datasets such as CLINC150, ROSTD-Coarse, SNIPS, and YELLOW. Results show that the proposed bi-encoder-based detectors outperform the other methods across all datasets. Because the approach does not require labeled OOD samples, training is simpler, and the method scales well to real-world applications. The study provides valuable insights into the effectiveness of bi-encoder-based detectors for OOD detection in NLP.
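To make the idea concrete, here is a minimal sketch of a bi-encoder-style OOD detector: embed each sentence with a fixed encoder, then score a test sentence by its maximum cosine similarity to the in-distribution training embeddings, so that low similarity signals OOD. This is an illustration only, not the paper's exact method; the `embed` function below is a hashed bag-of-words stand-in for a real sentence encoder such as USE or MPNet, used solely so the example runs without external models.

```python
import numpy as np

def embed(texts, dim=64):
    # Stand-in for a real bi-encoder (e.g., USE, MPNet): deterministic
    # hashed bag-of-words vectors, L2-normalized so dot product = cosine.
    vecs = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for token in text.lower().split():
            vecs[i, hash(token) % dim] += 1.0
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.clip(norms, 1e-9, None)

def ood_scores(train_texts, test_texts):
    # Score each test sentence by its max cosine similarity to any
    # in-distribution training sentence. Low score => likely OOD.
    # Note: only in-distribution data is needed; no labeled OOD samples.
    train_emb = embed(train_texts)
    test_emb = embed(test_texts)
    return (test_emb @ train_emb.T).max(axis=1)
```

In practice the stand-in `embed` would be replaced by one of the encoders compared in the paper, and a threshold on the score would be tuned on in-distribution data alone.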
Stats
Performance is assessed using metrics such as F1-Score, MCC, FPR@90, FPR@95, AUPR, and AUROC.
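Two of these metrics can be computed directly from detector scores. The sketch below, an assumption-free but simplified implementation, computes AUROC via the Mann-Whitney formulation (the probability that a random in-distribution example scores above a random OOD example) and FPR@95 (the fraction of OOD examples accepted when the threshold admits 95% of in-distribution examples).

```python
import numpy as np

def auroc(id_scores, ood_scores):
    # AUROC as the probability that a random in-distribution score
    # exceeds a random OOD score, with ties counted as half.
    id_s = np.asarray(id_scores)
    ood_s = np.asarray(ood_scores)
    wins = (id_s[:, None] > ood_s[None, :]).sum()
    ties = (id_s[:, None] == ood_s[None, :]).sum()
    return (wins + 0.5 * ties) / (len(id_s) * len(ood_s))

def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    # Set the threshold so that `tpr` of in-distribution examples are
    # accepted, then report the fraction of OOD examples also accepted.
    threshold = np.quantile(np.asarray(id_scores), 1.0 - tpr)
    return float(np.mean(np.asarray(ood_scores) >= threshold))
```

FPR@90 follows from the same function with `tpr=0.90`; a lower value means fewer OOD inputs slip past the detector at a fixed in-distribution acceptance rate.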
Experimental results demonstrate that the proposed bi-encoder-based detectors outperform other methods across all datasets.