The paper proposes a novel method called BLOOD (Between Layer Out-Of-Distribution Detection) for detecting OOD data in Transformer-based models. The key insight is that the transformations between intermediate layers of a Transformer network tend to be smoother for in-distribution (ID) data compared to OOD data.
The paper first demonstrates this empirically, showing that the ID data transformations are generally smoother than OOD data transformations, especially in the upper layers of the network. This is because the learning algorithm focuses on smoothing the transformations in the ID region of the representation space during training, while the OOD region is largely left unchanged.
The BLOOD method quantifies the smoothness of a between-layer transformation by the Frobenius norm of its Jacobian, i.e., the Jacobian of a layer's representation with respect to the previous layer's representation, evaluated at the given input. Because materializing this Jacobian is prohibitively expensive for Transformer-sized representations, the paper derives an unbiased estimator of the squared Frobenius norm based on Jacobian-vector products with random probe vectors.
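The sketch below illustrates the underlying idea rather than reproducing the authors' code: a Hutchinson-style Monte Carlo estimate of the squared Frobenius norm computed with Jacobian-vector products in PyTorch. The feed-forward block, the representation size, and the number of probe samples are placeholder assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

def frobenius_sq_estimate(layer_fn, h, num_samples=8):
    """Unbiased Monte Carlo estimate of ||J||_F^2, where J is the Jacobian
    of layer_fn at h. Uses the identity E_v[||J v||^2] = ||J||_F^2 for probe
    vectors v with zero mean and identity covariance, so the full Jacobian
    is never materialized.
    """
    estimates = []
    for _ in range(num_samples):
        v = torch.randn_like(h)  # Gaussian probe vector
        # Jacobian-vector product J @ v (computed via autograd; slow but simple)
        _, jvp_out = torch.autograd.functional.jvp(layer_fn, h, v)
        estimates.append(jvp_out.pow(2).sum())  # ||J v||^2
    return torch.stack(estimates).mean()

# Toy usage: a single feed-forward block stands in for the map from one
# layer's representation to the next (hypothetical shapes, not the paper's).
block = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 64))
h = torch.randn(64)                       # representation at some layer l
score = frobenius_sq_estimate(block, h)
print(score.item())                       # larger value -> less smooth transformation
```

In the paper, such per-layer estimates are aggregated across the network's layers to produce the final OOD score for an input; a smaller aggregate value indicates smoother transformations and hence data that is more likely in-distribution.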
The authors evaluate BLOOD on several text classification tasks using pre-trained Transformer models (RoBERTa and ELECTRA) and show that it outperforms other state-of-the-art white-box and black-box OOD detection methods. The improvements are more prominent on more complex datasets, where the learning algorithm has to make more substantial changes to the model, leading to a greater difference in smoothness between ID and OOD transformations.
Additionally, the paper analyzes the performance of BLOOD on different types of distribution shifts, finding that it is more effective in detecting background shift than semantic shift. This is attributed to BLOOD's focus on the encoding process of the input, rather than just the model's outputs.