Core Concepts
Perturbation-based explanation methods extended to generative language models provide locally faithful explanations of generated outputs.
Abstract
Directory:
  Abstract
  Introduction
  Challenges Addressed:
    Output Text Handling
    Input Length Management
  Framework Overview: MExGen
  Evaluation Tasks:
    Summarization
    Context-grounded Question Answering
  Comparison with Existing Methods
  Automated Evaluation Results:
    Unit Ranking Similarity Across Scalarizers
    Perturbation Curves Analysis
    Area Under the Perturbation Curve (AUPC)
  User Study Insights
Key Highlights:
Proposal of MExGen framework for generative language models.
Addressing challenges of output text and input length in attribution methods.
Systematic evaluation showing that MExGen provides more locally faithful explanations than the compared methods.
Comparison with existing methods like P-SHAP and CaptumLIME.
User study revealing perceptions of fidelity, preference, and concentration of attribution scores.
Stats
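The authors focus on extending perturbation-based explanation methods, which are commonly applied to text classification, to generative language models.
The MExGen framework provides a multi-level strategy for handling long inputs.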
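
To make the idea of perturbation-based attribution for text outputs concrete, here is a minimal sketch. It is not the MExGen implementation: `toy_model`, `overlap_scalarizer`, and `leave_one_out` are hypothetical names introduced only for illustration, the "model" is a trivial stand-in for a generative language model, and leave-one-out is used as one simple perturbation scheme. The sketch shows the two ingredients named above: a scalarizer that turns output text into a number, and attribution computed over coarse input units (sentences).

```python
from typing import Callable, List


def toy_model(units: List[str]) -> str:
    """Hypothetical stand-in for a generative model: echoes the longest input unit."""
    return max(units, key=len) if units else ""


def overlap_scalarizer(reference: str, candidate: str) -> float:
    """Assumed scalarizer: fraction of reference words that also occur in candidate."""
    ref_words = reference.lower().split()
    cand_words = set(candidate.lower().split())
    if not ref_words:
        return 0.0
    return sum(w in cand_words for w in ref_words) / len(ref_words)


def leave_one_out(
    units: List[str],
    model: Callable[[List[str]], str],
    scalarizer: Callable[[str, str], float],
) -> List[float]:
    """Score each unit by how much removing it changes the scalarized output."""
    original = model(units)
    baseline = scalarizer(original, original)
    scores = []
    for i in range(len(units)):
        perturbed = units[:i] + units[i + 1:]  # drop unit i
        scores.append(baseline - scalarizer(original, model(perturbed)))
    return scores


sentences = [
    "The cat sat.",
    "The committee approved the annual budget after a long debate.",
    "It rained.",
]
scores = leave_one_out(sentences, toy_model, overlap_scalarizer)
top = max(range(len(sentences)), key=lambda i: scores[i])
print(scores, "-> most influential unit:", top)
```

Under these assumptions, a multi-level variant would rerun the same loop on finer units (phrases or words) inside only the top-scoring sentences, which keeps the number of model calls manageable for long inputs.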