Core Concepts
Human gaze data offers a valuable alternative to human rationale annotations for evaluating explainability (XAI) methods in NLP.
Abstract
The study compares human gaze data with rationale annotations to evaluate XAI methods. It explores factors influencing data quality, task difficulty indicators, and decoding accuracies. Results show the potential of webcam-based gaze data as a cost-effective alternative for evaluating model explanations in multilingual settings.
The research analyzes the WebQAmGaze dataset, eye-tracking patterns, and model explanations. It highlights correlations between gaze entropy, reading times, and task difficulty, and examines how well model explanations align with human signals from both rationales and gaze patterns.
Factors such as variation in WebGazer accuracy, participant characteristics (e.g., wearing glasses), text length, and answer position influence decoding accuracies. The findings suggest that better webcam accuracy leads to higher decoding accuracies, underscoring the importance of accounting for such factors when using gaze data for evaluation.
Overall, the research demonstrates the potential of webcam-based gaze data as a complementary source of information for evaluating XAI methods in NLP tasks across different languages and models.
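The alignment between model explanations and gaze signals described above is typically measured by correlating per-token importance scores with per-token fixation measures. A minimal sketch of such a comparison, using a hand-rolled Spearman rank correlation over hypothetical saliency scores and fixation durations (the specific values and metric choice are illustrative assumptions, not taken from the paper):

```python
def ranks(values):
    """1-based average ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(a, b):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = sum((x - ma) ** 2 for x in ra) ** 0.5
    sb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (sa * sb)


# Hypothetical per-token scores for a 6-token passage:
saliency = [0.05, 0.40, 0.10, 0.30, 0.10, 0.05]  # model explanation scores
gaze_ms = [120, 600, 150, 480, 200, 90]          # total fixation durations (ms)
print(round(spearman(saliency, gaze_ms), 2))     # high rank correlation
```

Rank correlation is a natural choice here because saliency scores and fixation durations live on very different scales; only the relative ordering of tokens is compared.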
Stats
Recording human gaze via webcams enables collection of larger datasets.
Entropy calculated on fixation patterns is an indicator for task difficulty.
Decoding accuracies vary across languages based on gaze data.
Model explanations can be used to decode human rationales effectively.
Factors like wearing glasses affect webcam accuracy in eye-tracking studies.
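The entropy-as-difficulty indicator listed above can be made concrete: compute the Shannon entropy of the distribution of fixations over the words of a text, where more evenly spread fixations yield higher entropy. A minimal sketch (the function name and example fixation sequences are illustrative assumptions, not the paper's implementation):

```python
import math
from collections import Counter

def gaze_entropy(fixated_word_indices):
    """Shannon entropy (bits) of the fixation distribution over words.

    Higher entropy means fixations are spread more evenly across the
    text, which the summary links to higher task difficulty.
    """
    counts = Counter(fixated_word_indices)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    return -sum(p * math.log2(p) for p in probs)

# Focused reading: most fixations land on one word -> lower entropy
print(gaze_entropy([0, 0, 0, 0, 1]))
# Scattered reading: fixations spread across many words -> higher entropy
print(gaze_entropy([0, 1, 2, 3, 4]))
```

With fixations uniform over n distinct words the entropy reaches its maximum of log2(n), so the measure is easy to normalize across texts of different lengths.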
Quotes
"We find that models overall show slightly higher accuracies for shorter than longer answers."
"Webcam-based eye-tracking provides useful linguistic information even in lower quality recordings."
"The study emphasizes the importance of considering factors like text length and answer position when using gaze data for evaluation."