Core Concepts
Explanatory multiverse embraces the multiplicity of counterfactual explanations and captures the spatial geometry of the journeys leading to them, enabling more informed and personalized navigation of alternative recourse options.
Abstract
The paper introduces the novel concept of "explanatory multiverse" to address the limitations of current counterfactual explanation approaches. Counterfactual explanations are popular for interpreting decisions of opaque machine learning models, but existing methods treat each counterfactual path independently, neglecting the spatial relations between them.
The authors formalize the explanatory multiverse, which encompasses all possible counterfactual journeys and their geometric properties, such as affinity, branching, divergence, and convergence. They propose two methods to navigate and reason about this multiverse:
Vector space interpretation: Counterfactual paths are represented as normalized vectors, allowing comparison of journeys of varying lengths. Branching points are identified, and directional differences between paths are computed.
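A minimal sketch of this idea in Python. It is illustrative only, not the paper's implementation: representing a journey by the normalized sum of its step vectors, and measuring directional difference as the angle between journey vectors, are simplifying assumptions made here.

```python
import numpy as np

def path_vector(path):
    """Represent a counterfactual journey as one normalized vector.

    `path` is a sequence of data points; summing the per-step feature
    changes and normalizing makes journeys of different lengths
    comparable (an assumed simplification of the paper's method).
    """
    path = np.asarray(path, dtype=float)
    steps = np.diff(path, axis=0)      # feature changes at each step
    v = steps.sum(axis=0)              # overall direction of travel
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def directional_difference(path_a, path_b):
    """Angle (in radians) between two journeys' direction vectors."""
    cos = np.clip(np.dot(path_vector(path_a), path_vector(path_b)), -1.0, 1.0)
    return float(np.arccos(cos))

# Two journeys branching from the same factual point (0, 0):
p1 = [(0, 0), (1, 0), (2, 0)]  # changes only feature 1
p2 = [(0, 0), (0, 1), (0, 2)]  # changes only feature 2
# directional_difference(p1, p2) -> pi/2 (orthogonal journeys)
```

Journeys with a small angle between them can then be grouped as spatially similar, while a large angle signals divergent recourse directions.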
Directed graph interpretation: Counterfactual paths are modeled as a directed graph, where vertices represent data points and edges capture feature changes. This approach accounts for feature monotonicity and allows quantifying branching factors and loss of opportunity.
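The graph view can be sketched as follows. This is a hypothetical minimal structure, not the paper's code: the class name, the dictionary-of-lists representation, and the loss-of-opportunity calculation (candidates forfeited by committing to one step) are assumptions; edge direction is where monotonic features (e.g. age can only increase) would be encoded.

```python
from collections import defaultdict

class CounterfactualGraph:
    """Directed graph: vertices are data points, edges are feasible
    feature changes (direction encodes monotonicity constraints)."""

    def __init__(self):
        self.edges = defaultdict(list)

    def add_edge(self, u, v):
        self.edges[u].append(v)

    def branching_factor(self, v):
        """Number of distinct onward journeys at vertex v."""
        return len(self.edges[v])

    def reachable(self, start):
        """All vertices reachable from `start` (iterative DFS)."""
        seen, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u not in seen:
                seen.add(u)
                stack.extend(self.edges[u])
        return seen

# Factual point x0 branches towards two intermediate points:
g = CounterfactualGraph()
g.add_edge("x0", "a"); g.add_edge("x0", "b")
g.add_edge("a", "cf1")                     # one counterfactual via a
g.add_edge("b", "cf2"); g.add_edge("b", "cf3")  # two via b
# Committing to the step x0 -> a forfeits everything only reachable via b:
loss = len(g.reachable("x0")) - len(g.reachable("a"))
```

Here `g.branching_factor("x0")` is 2, and `loss` counts the vertices no longer reachable after taking the first step towards `a`, which is one possible way to quantify loss of opportunity.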
The key benefits of explanatory multiverse include:
Granting explainees agency by allowing them to select counterfactuals based on the properties of the journey, not just the final destination.
Reducing the cognitive load of explainees by recognizing spatial (dis)similarity of counterfactuals and streamlining exploration.
Aligning with human modes of counterfactual thinking and supporting interactive, dialogue-based explainability.
Uncovering disparities in access to counterfactual recourse, enabling fairness considerations.
The authors demonstrate the capabilities of their approaches on synthetic and real-world data sets, and discuss future research directions, such as incorporating complex dynamics and explanation representativeness.