The researchers explore the use of Neural Additive Models (NAMs) for explainable automatic short answer grading (ASAG). NAMs combine the performance of neural networks with the interpretability of additive models, allowing stakeholders to understand which features of a student response are important for the predicted grade.
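The additive structure described above can be sketched in a few lines: each input feature gets its own small network whose scalar output is its contribution, and the prediction is the sum of those contributions plus a bias. This is a minimal illustrative sketch, not the paper's implementation; the network sizes and weights here are arbitrary.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

class TinyNAM:
    """Minimal Neural Additive Model sketch: one small (1 -> hidden -> 1)
    network per feature; the logit is the sum of per-feature outputs."""

    def __init__(self, n_features, hidden=4, seed=0):
        rng = np.random.default_rng(seed)
        # One independent subnetwork per feature (weights are illustrative).
        self.w1 = rng.normal(size=(n_features, hidden))
        self.b1 = np.zeros((n_features, hidden))
        self.w2 = rng.normal(size=(n_features, hidden))
        self.bias = 0.0

    def shape_functions(self, x):
        """Per-feature contributions; x has shape (batch, n_features)."""
        h = relu(x[:, :, None] * self.w1[None] + self.b1[None])
        return (h * self.w2[None]).sum(axis=-1)  # (batch, n_features)

    def predict_proba(self, x):
        logits = self.shape_functions(x).sum(axis=-1) + self.bias
        return 1.0 / (1.0 + np.exp(-logits))

nam = TinyNAM(n_features=3)
x = np.array([[1.0, 0.0, 1.0]])
contribs = nam.shape_functions(x)  # one interpretable term per feature
proba = nam.predict_proba(x)
```

Because the logit is an exact sum of the per-feature terms, each feature's contribution to a given prediction can be read off directly, which is what makes the model interpretable.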
The researchers use a Knowledge Integration (KI) framework to guide feature engineering, creating inputs that reflect whether a student includes certain ideas in their response. They hypothesize that the inclusion (or exclusion) of these predefined KI ideas as features will be sufficient for the NAM to have good predictive power and interpretability.
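The KI-guided features are binary indicators of whether a response includes a predefined idea. A toy sketch of such feature extraction is shown below; the idea names and keyword proxies are hypothetical, and the paper's actual idea detection is more sophisticated than substring matching.

```python
# Hypothetical KI ideas mapped to keyword proxies (illustrative only).
KI_IDEAS = {
    "energy_transfer": ["energy", "transfer"],
    "light_absorption": ["absorb", "light"],
    "heat": ["heat", "warm"],
}

def ki_features(response: str) -> dict:
    """Binary feature per idea: 1 if any keyword for that idea appears."""
    text = response.lower()
    return {idea: int(any(kw in text for kw in kws))
            for idea, kws in KI_IDEAS.items()}

feats = ki_features("The light is absorbed and becomes heat energy.")
```

Each resulting 0/1 vector then serves as the input to the NAM (or LR) model, so the learned shape functions correspond directly to the presence or absence of ideas.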
The performance of the NAM is compared to a logistic regression (LR) model using the same features, and a non-explainable neural model, DeBERTa, that does not require feature engineering. The results show that the NAM outperforms the LR model in terms of the Quadratic Weighted Cohen's Kappa (QWK) metric, a standard ASAG evaluation metric, on the KI data at a statistically significant level. While the DeBERTa model performs better than the NAM, the difference is not statistically significant.
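The QWK metric used for these comparisons penalizes disagreements between assigned and predicted grades by the square of their distance on the ordinal scale. A compact sketch of the standard formula:

```python
import numpy as np

def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Quadratic Weighted Cohen's Kappa for ordinal labels 0..n_classes-1."""
    # Observed confusion matrix.
    O = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        O[t, p] += 1
    # Quadratic disagreement weights.
    i = np.arange(n_classes)
    W = (i[:, None] - i[None, :]) ** 2 / (n_classes - 1) ** 2
    # Expected confusion matrix under independent marginals.
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / O.sum()
    return 1.0 - (W * O).sum() / (W * E).sum()
```

QWK is 1 for perfect agreement and 0 for chance-level agreement, which makes it well suited to ordinal grading scales where a one-point error matters less than a three-point error.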
The researchers provide visualizations of the NAM's feature importance and shape functions, which allow stakeholders to understand which ideas in the student responses are most indicative of the assigned grade and how the model makes its predictions. This interpretability is a key advantage of the NAM over black-box neural models.
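One common way such feature-importance summaries are computed for additive models is the mean absolute contribution of each feature across a dataset. The sketch below assumes a matrix of per-response, per-feature contributions (the values are made up for illustration):

```python
import numpy as np

# Hypothetical NAM contributions: 4 responses x 3 KI features.
contributions = np.array([
    [ 0.8, -0.1, 0.0],
    [ 0.7,  0.2, 0.1],
    [-0.3,  0.5, 0.0],
    [ 0.9, -0.2, 0.1],
])

# Importance = mean absolute contribution per feature.
importance = np.abs(contributions).mean(axis=0)
ranking = np.argsort(-importance)  # indices of features, most important first
```

Plotting each feature's contribution against its input value yields the shape functions, letting stakeholders see not just which ideas matter but in which direction they push the grade.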
The findings suggest that NAMs may be a suitable alternative to traditional explainable models such as logistic regression for ASAG, providing intelligibility without sacrificing much predictive performance. The researchers note that further investigation is needed to generalize the results to different question types and domains.
Key insights distilled from: Aubrey Condo... at arxiv.org, 05-02-2024
https://arxiv.org/pdf/2405.00489.pdf