Core Concepts
Adding an explicit cardinality constraint to counterfactual generation yields sparser explanations of machine learning model predictions that are easier to interpret.
Abstract
The paper addresses the problem of generating counterfactual explanations for machine learning models: examples that differ from a given input in the predicted target and in some subset of features. The main difficulty is that a counterfactual can differ from the original example in many features at once, which makes it hard to interpret.
The paper proposes to explicitly add a cardinality constraint to the counterfactual generation process, limiting the number of features that can be different from the original example. This is implemented as an extension to the CERTIFAI framework, a model-agnostic approach for generating counterfactual explanations.
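The idea of limiting the number of changed features can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's or CERTIFAI's actual code: the function names `cardinality` and `penalized_fitness`, the `penalty` weight, and the specific form of the penalty (dividing the fitness by a factor that grows with the number of excess features) are all assumptions made for the example. It takes whatever base fitness the search already computes for a candidate and lowers it when the candidate modifies more than k features.

```python
def cardinality(x, candidate, tol=1e-9):
    """Count the features in which `candidate` differs from the input `x`."""
    count = 0
    for a, b in zip(x, candidate):
        if isinstance(a, float) or isinstance(b, float):
            # Numeric features: treat values within `tol` as unchanged.
            if abs(a - b) > tol:
                count += 1
        elif a != b:
            # Categorical or integer features: exact comparison.
            count += 1
    return count


def penalized_fitness(base_fitness, x, candidate, k, penalty=0.5):
    """Hypothetical penalty: shrink the fitness for each feature beyond k.

    `base_fitness` is the score the search would assign without the
    constraint; the penalty form and weight are illustrative assumptions.
    """
    excess = max(0, cardinality(x, candidate) - k)
    return base_fitness / (1.0 + penalty * excess)


x = (16, "M", "LOW", "HIGH", 12.006)
c = (15, "M", "LOW", "HIGH", 22.82)

within_limit = penalized_fitness(1.0, x, c, k=2)  # 2 changes, k=2: no penalty
over_limit = penalized_fitness(1.0, x, c, k=1)    # 2 changes, k=1: penalized
```

With a soft penalty of this kind, candidates that exceed k are not forbidden outright but become less likely to survive selection in an evolutionary search.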
The results show that cardinality-constrained counterfactuals are easier to interpret than unconstrained ones. For example, a counterfactual that changes at most 2 or 3 features can be read directly as "the target would change if the age is 15 and the NaToK ratio increases to 22.82".
The paper also provides additional experiments on the Car Evaluation dataset, further demonstrating the effectiveness of the cardinality-constrained approach in generating sparse and interpretable counterfactual explanations.
Stats
"Age 16, Sex M, BP LOW, Cholesterol HIGH, NaToK 12.006"
"Age 17, Sex M, BP NORMAL, Cholesterol NORMAL, NaToK 11.29"
"Age 15, Sex M, BP LOW, Cholesterol HIGH, NaToK 22.82"
"Age 15, Sex M, BP HIGH, Cholesterol HIGH, NaToK 11.04"
Quotes
"Even if a counterfactual is close to the original example in feature space (say, in terms of the Euclidean distance between x and x̂), slight changes in a high number of features can have a negative effect on its interpretability."
"We forked the publicly available repository from the authors and implemented an additional cardinality constraint by penalizing those individuals with a cardinality (number of modified features with respect to the input example) higher than the target value k."