Core Concept
Explanation-based Bias Decoupling Regularization (EBD-Reg) trains natural language inference models to distinguish and decouple task-relevant keywords from biases, enabling them to focus on the intended features and improve out-of-distribution inference performance.
Summary
The paper proposes Explanation-based Bias Decoupling Regularization (EBD-Reg), a method for improving the robustness of Transformer-based Natural Language Inference (NLI) models.
Key highlights:
- Transformer-based NLI models tend to rely more on dataset biases than on the intended task-relevant features, compromising their robustness.
- Traditional debiasing methods focus on identifying which samples are biased, but do not specify which parts within a sample are biased, limiting their ability to handle out-of-distribution inference.
- EBD-Reg is inspired by how humans explain causal relationships, focusing on the main contradictions that differentiate between cause and effect.
- EBD-Reg establishes a tripartite parallel supervision of Distinguishing (identifying keywords and biases), Decoupling (encouraging the model to focus on keywords while suppressing biases), and Aligning (aligning the joint predictive distribution of keyword and bias inference with the main inference).
- Extensive experiments show that EBD-Reg can be easily integrated with various Transformer-based encoders, significantly outperforming other debiasing methods in out-of-distribution inference performance.
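The three supervision signals listed above can be pictured as three extra loss terms added to the usual NLI task loss. The following is only a minimal sketch under stated assumptions, not the paper's formulation: the function names, the use of binary cross-entropy for Distinguishing, the attention-share penalty for Decoupling, the sum-of-logits joint distribution with a KL term for Aligning, and the weights are all hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distinguish_loss(keyword_probs, keyword_labels):
    """Distinguishing (assumed form): mean binary cross-entropy per token.

    keyword_labels[i] is 1 if token i is a keyword and 0 if it is a bias.
    """
    eps = 1e-12
    losses = [
        -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
        for p, y in zip(keyword_probs, keyword_labels)
    ]
    return sum(losses) / len(losses)


def decouple_loss(attention, keyword_labels):
    """Decoupling (assumed form): negative log of the attention share on keywords.

    The loss is near zero when all attention mass falls on keyword tokens
    and grows as more mass moves to bias tokens.
    """
    total = sum(attention)
    on_keywords = sum(a for a, y in zip(attention, keyword_labels) if y == 1)
    return -math.log(on_keywords / total + 1e-12)


def align_loss(main_logits, keyword_logits, bias_logits):
    """Aligning (assumed form): KL(main || joint).

    The joint distribution is taken here as the softmax of the summed
    keyword and bias logits, which is one simple way to combine them.
    """
    p_main = softmax(main_logits)
    p_joint = softmax([k + b for k, b in zip(keyword_logits, bias_logits)])
    return sum(p * math.log(p / q) for p, q in zip(p_main, p_joint) if p > 0)


def combined_loss(task_loss, l_dist, l_dec, l_align, weights=(1.0, 1.0, 1.0)):
    """Task loss plus the three regularizers; the weights are hypothetical."""
    w_dist, w_dec, w_align = weights
    return task_loss + w_dist * l_dist + w_dec * l_dec + w_align * l_align
```

In this sketch the three terms are computed independently and summed, which mirrors the "parallel" supervision described in the summary; how the paper actually weights or schedules them is not stated here.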
Statistics
"A girl cannot be washing a load of laundry while playing a violin."
"Replacements of words appearing in explanations lead to a noticeable accuracy drop, with further reductions for words not intersecting in the premise and hypothesis."
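The second observation groups words by whether they appear in the human explanation and whether they are shared by the premise and hypothesis. The sketch below shows one plausible way to compute such groups for a single example. It is an illustration only: the function names, the tag labels, the regex tokenizer, and the premise/hypothesis pair are hypothetical (only the explanation sentence comes from the quote above), and this is not the paper's keyword/bias criterion.

```python
import re


def tokenize(text):
    """Lowercase word tokenizer; a real pipeline would use the model's tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def tag_words(premise, hypothesis, explanation):
    """Tag each premise/hypothesis word by its relation to the explanation.

    - "explained_unshared": in the explanation, in only one of premise/hypothesis
    - "explained_shared":   in the explanation, in both premise and hypothesis
    - "other":              not mentioned in the explanation
    """
    p = set(tokenize(premise))
    h = set(tokenize(hypothesis))
    e = set(tokenize(explanation))
    tags = {}
    for word in p | h:
        if word not in e:
            tags[word] = "other"
        elif word in p and word in h:
            tags[word] = "explained_shared"
        else:
            tags[word] = "explained_unshared"
    return tags


# Hypothetical premise/hypothesis pair paired with the quoted explanation.
tags = tag_words(
    premise="A girl is playing a violin.",
    hypothesis="A girl is washing a load of laundry.",
    explanation="A girl cannot be washing a load of laundry while playing a violin.",
)
for word in sorted(tags):
    print(word, tags[word])
```

Function words such as "a" and "of" are tagged like any other word here; a practical version would probably filter stop words before treating a tag as a keyword signal.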
Quotes
"While traditional NLI debiasing methods teach models 'which samples are biased', our approach, rooted in human explanation, aims to instruct models 'which parts of a sample are biased'."
"Inspired by this human aptitude, we thoroughly analyze the inherent connection between human explanations and biases at word level in Section IV-A, summarizing criteria for distinguishing keywords and biases from a human perspective."