Core Concepts
SHAP explanations can be computed efficiently for weighted automata and for disjoint DNFs (a class that includes decision trees) when the background data distribution is Markovian.
Abstract
The article investigates the computational complexity of the SHAP score (SHapley Additive exPlanations), a widely used framework for local interpretability of machine learning models. The authors focus on the case where the underlying model is a weighted automaton (WA) and the background data distribution is Markovian.
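For reference, the quantity under study is the SHAP score of a feature taken with respect to a background distribution. The formulation below is the standard one used in the SHAP-complexity literature; the paper's exact notation may differ.

```latex
% SHAP score of feature x, for model M, instance e over feature set X,
% and background distribution D.
\mathrm{SHAP}_{\mathcal{D}}(M, e, x) =
  \sum_{S \subseteq X \setminus \{x\}}
    \frac{|S|!\,(|X| - |S| - 1)!}{|X|!}
    \bigl( \phi_{M,e}(S \cup \{x\}) - \phi_{M,e}(S) \bigr),
\quad \text{where} \quad
\phi_{M,e}(S) = \mathbb{E}_{e' \sim \mathcal{D}}\bigl[ M(e') \mid e'_S = e_S \bigr].
```

The difficulty lies in the exponentially many coalitions S in the sum; tractability results show that, for specific pairs of model class and distribution class, this sum can nevertheless be evaluated in polynomial time.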
The key contributions are:
The authors provide a constructive proof showing that the computation of the SHAP score for the class of WAs is tractable under the Markovian assumption. This result extends the existing positive complexity results on SHAP score computation, which were mostly derived under the feature independence assumption.
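As background for this result, a weighted automaton maps a word to a real number via a product of transition matrices, which is what makes automata-theoretic manipulation of the model possible. The NumPy sketch below illustrates this evaluation; it only illustrates the model class and is not the construction used in the proof.

```python
import numpy as np

class WeightedAutomaton:
    """Linear representation of a WA: f(w_1 ... w_n) = alpha^T A_{w_1} ... A_{w_n} beta."""

    def __init__(self, alpha, transitions, beta):
        self.alpha = np.asarray(alpha, dtype=float)          # initial weight vector, shape (q,)
        self.transitions = {s: np.asarray(m, dtype=float)    # one (q, q) matrix per symbol
                            for s, m in transitions.items()}
        self.beta = np.asarray(beta, dtype=float)            # final weight vector, shape (q,)

    def value(self, word):
        # Left-to-right matrix-vector products: O(q^2) per symbol, linear in |word|.
        v = self.alpha
        for symbol in word:
            v = v @ self.transitions[symbol]
        return float(v @ self.beta)

# Toy example: a two-state WA over {a, b} whose value on a word is its number of 'a's.
wa = WeightedAutomaton(
    alpha=[1.0, 0.0],
    transitions={"a": [[1.0, 1.0], [0.0, 1.0]],   # reading 'a' adds 1 to the running count
                 "b": [[1.0, 0.0], [0.0, 1.0]]},  # reading 'b' leaves the count unchanged
    beta=[0.0, 1.0],
)
assert wa.value("abab") == 2.0
```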
The authors show that under the same Markovian assumption, the computation of the SHAP score for the class of disjoint DNFs (which includes decision trees) is also tractable. This is achieved by a polynomial-time reduction from the SHAP problem for disjoint DNFs to the SHAP problem for WAs.
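For intuition on why decision trees fall into this class: each root-to-leaf path of a tree ending in a positive leaf yields one conjunctive term, and any two such paths disagree on some tested variable, so the resulting DNF is disjoint, meaning no assignment satisfies two terms at once. The toy example below is hypothetical and only illustrates this property, not the paper's reduction to WAs.

```python
from itertools import product

# Hypothetical decision tree over Boolean features x1, x2, x3, written as the terms
# of its positive root-to-leaf paths:
#   test x1; if x1 = True, test x2 (positive leaf when x2 = True);
#            if x1 = False, test x3 (positive leaf when x3 = False).
VARS = ["x1", "x2", "x3"]
disjoint_dnf = [
    {"x1": True, "x2": True},    # term: x1 AND x2
    {"x1": False, "x3": False},  # term: NOT x1 AND NOT x3
]

def satisfied(term, assignment):
    return all(assignment[v] == val for v, val in term.items())

def evaluate(dnf, assignment):
    return any(satisfied(term, assignment) for term in dnf)

# Disjointness: no complete assignment satisfies two terms simultaneously.
for values in product([False, True], repeat=len(VARS)):
    assignment = dict(zip(VARS, values))
    assert sum(satisfied(term, assignment) for term in disjoint_dnf) <= 1
```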
The proof strategy decomposes the SHAP score computation into a set of operations over languages and seq2seq (sequence-to-sequence) languages, represented by WAs and weighted transducers (WTs), respectively. The authors then construct WAs and WTs that compute these languages and seq2seq languages efficiently under the Markovian assumption.
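One reason the Markovian assumption meshes well with this toolkit is that a first-order Markov distribution over words is itself realizable by a small WA (one state per alphabet symbol plus a start state) whose value on a word equals the word's probability. The sketch below uses made-up numbers and is purely illustrative, not the authors' construction.

```python
import numpy as np

# Hypothetical first-order Markov chain over {a, b}, encoded as a WA.
symbols = ["a", "b"]
pi = {"a": 0.6, "b": 0.4}                          # initial distribution Pr(w_1)
P = {("a", "a"): 0.7, ("a", "b"): 0.3,             # transition probabilities Pr(w_{t+1} | w_t)
     ("b", "a"): 0.2, ("b", "b"): 0.8}

q = len(symbols) + 1                               # last index is the start state
idx = {s: i for i, s in enumerate(symbols)}
alpha = np.zeros(q); alpha[-1] = 1.0               # all initial mass in the start state
beta = np.ones(q)                                  # every state is final with weight 1
A = {}
for s in symbols:
    m = np.zeros((q, q))
    m[-1, idx[s]] = pi[s]                          # start state -> s with Pr(w_1 = s)
    for p in symbols:
        m[idx[p], idx[s]] = P[(p, s)]              # state p -> s with Pr(s | p)
    A[s] = m

def wa_value(word):
    v = alpha
    for s in word:
        v = v @ A[s]
    return float(v @ beta)

assert abs(wa_value("ab") - 0.6 * 0.3) < 1e-12     # pi(a) * P(b | a)
```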
This work provides a formal argument to substantiate the claim that WAs enjoy better transparency than their neural counterparts, as they allow for efficient SHAP score computation under realistic data distributions.