The paper discusses the limitations of SHAP scores (SHapley Additive exPlanations) for feature attribution in machine learning models. Recent work has uncovered examples of classifiers where SHAP scores assign misleading importance to features: features with no influence on the prediction can receive higher scores than the features that are most influential.
The paper argues that these issues stem not from the theoretical foundations of Shapley values, but from the characteristic functions used in earlier work to define SHAP scores. It outlines several key properties that a characteristic function should satisfy for the resulting SHAP scores to avoid the identified issues.
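For context, a brief sketch of the standard construction these properties constrain: given a set of features $F$ and a characteristic function $v$ mapping subsets of $F$ to reals, the Shapley value of feature $i$ is

\[
\phi_i(v) \;=\; \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F| - |S| - 1)!}{|F|!}\,\bigl( v(S \cup \{i\}) - v(S) \bigr).
\]

SHAP scores instantiate $v(S)$ as an expected value of the model's output with the features in $S$ fixed to their values in the instance being explained; the paper's point is that the choice of this $v$, not the Shapley machinery itself, is what produces the misleading attributions.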
The paper then proposes several novel characteristic functions that satisfy one or more of these properties. It also analyzes the computational complexity of computing SHAP scores with the new characteristic functions, showing that they are at least as hard to compute as those used in earlier work.
Finally, the paper proposes a simple modification of the SHAP tool that substitutes one of the novel characteristic functions, which eliminates some of the limitations reported for SHAP scores.
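To make concrete how swapping the characteristic function changes the scores, here is a minimal, hypothetical sketch of exact Shapley-score computation by brute force, with the characteristic function passed in as a parameter. The function names, the toy AND model, and the uniform-marginal `v` are illustrative assumptions, not the paper's actual modification or the SHAP library's API.

```python
from itertools import combinations
from math import factorial

def shapley_scores(features, v):
    """Exact Shapley values under characteristic function v,
    where v maps a frozenset of feature names to a real number.
    Brute force (exponential in len(features)): toy-sized inputs only."""
    n = len(features)
    scores = {}
    for i in features:
        rest = [f for f in features if f != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in (frozenset(c) for c in combinations(rest, k)):
                total += weight * (v(S | {i}) - v(S))
        scores[i] = total
    return scores

# Toy setting: a two-input AND classifier explained at instance (1, 1).
def model(x1, x2):
    return int(x1 and x2)

instance = {"x1": 1, "x2": 1}

# One possible characteristic function: the expected model output when
# the features in S are fixed to the instance's values and the remaining
# features are drawn uniformly from {0, 1}.
def v(S):
    outputs = [
        model(x1, x2)
        for x1 in ((instance["x1"],) if "x1" in S else (0, 1))
        for x2 in ((instance["x2"],) if "x2" in S else (0, 1))
    ]
    return sum(outputs) / len(outputs)

print(shapley_scores(["x1", "x2"], v))  # {'x1': 0.375, 'x2': 0.375}
```

Because `v` is an explicit parameter, substituting a different characteristic function is a small, local change; this is the kind of swap the paper's proposed modification relies on.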