Core Concepts
Causal machine learning methods such as the causal forest can flexibly estimate heterogeneous treatment effects, but their black-box nature poses transparency challenges that must be addressed on two fronts: usability for analysts and accountability to the public.
Abstract
The paper discusses the use of causal machine learning, particularly the causal forest method, for estimating heterogeneous treatment effects in the context of policy evaluation. It highlights two key transparency issues that need to be addressed:
Usability:
Causal machine learning models are generally black boxes, making it difficult for analysts to understand the underlying data-generating process and gain insight into the causal effects.
Existing tools like explainable AI (XAI) and interpretable AI (IAI) can help provide transparency, but need to be adapted for the causal setting.
Usability requires understanding not only the final causal model but also the nuisance models (e.g., propensity and outcome models) used for identification.
Transparency is crucial for analysts to properly interpret and weigh the evidence from these complex models.
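One way existing XAI tools can be adapted to the causal setting is to apply them to the fitted effect model rather than to a predictive model. The sketch below applies permutation importance, a standard XAI technique, to a stand-in effect function; the covariates, coefficients, and the effect function itself are hypothetical, and this is an illustration of the idea rather than anything from the paper.

```python
import random

# Permutation importance applied to a (hypothetical) fitted effect model:
# shuffle one covariate at a time and measure how much the estimated effects
# move. A covariate that the effect model ignores should have importance ~0.

random.seed(1)

def effect_model(age, urban):
    """Stand-in for a fitted CATE model: effect depends on age, not urban."""
    return 3000.0 + 40.0 * age + 0.0 * urban

# Hypothetical evaluation sample of covariates.
ages = [random.uniform(20, 60) for _ in range(500)]
urbans = [random.randint(0, 1) for _ in range(500)]
base = [effect_model(a, u) for a, u in zip(ages, urbans)]

def permutation_importance(feature):
    """Mean absolute change in estimated effects when one feature is shuffled."""
    shuffled_ages = random.sample(ages, len(ages)) if feature == "age" else ages
    shuffled_urbans = random.sample(urbans, len(urbans)) if feature == "urban" else urbans
    permuted = [effect_model(a, u) for a, u in zip(shuffled_ages, shuffled_urbans)]
    return sum(abs(p - b) for p, b in zip(permuted, base)) / len(base)

# Shuffling age disturbs the effect estimates; shuffling urban does not,
# which tells the analyst which covariates drive the estimated heterogeneity.
```

The same probe says nothing about whether the underlying effect estimates are identified correctly, which is why the paper argues such tools need adaptation rather than direct reuse.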
Accountability:
The black-box nature of causal machine learning models makes it difficult to hold decision-makers accountable, as the reasoning behind estimates may not be clear.
Transparency is important for the public to understand how policy decisions are being made and to identify potential unfairness or injustice.
Because human decision-makers sit between a causal model and its real-world impact, the role of transparency differs from purely predictive applications, where model outputs act more directly.
The paper then applies these ideas to a case study estimating the heterogeneous returns to education in Australia using the Household Income and Labour Dynamics in Australia (HILDA) survey. It demonstrates the limitations of existing XAI and IAI tools for providing the necessary transparency in causal machine learning, and concludes that new tools are needed to properly understand these models and the algorithms that fit them.
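The causal forest itself is too involved for a short sketch, but the core task it performs in the case study, estimating how a treatment effect varies with covariates, can be illustrated with a simple two-model ("T-learner") stand-in on simulated data. The data and coefficients below are entirely hypothetical, and a T-learner is a deliberately simplified substitute for the causal forest, not the paper's method.

```python
import random

# Illustrative stand-in for heterogeneous effect estimation: fit one outcome
# model per treatment arm and take the difference of their predictions as the
# conditional effect estimate (a "T-learner"). The real causal forest instead
# averages many honest trees, which is what makes it a black box.

random.seed(0)

def fit_linear(xs, ys):
    """Ordinary least squares for a single feature: returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

# Simulated data: the treatment effect grows with the covariate x.
n = 2000
x = [random.uniform(0, 1) for _ in range(n)]
t = [random.randint(0, 1) for _ in range(n)]  # treatment indicator
y = [2.0 + 1.0 * xi + ti * (3.0 + 4.0 * xi) + random.gauss(0, 0.5)
     for xi, ti in zip(x, t)]

# Fit one outcome model per arm.
x1 = [xi for xi, ti in zip(x, t) if ti == 1]
y1 = [yi for yi, ti in zip(y, t) if ti == 1]
x0 = [xi for xi, ti in zip(x, t) if ti == 0]
y0 = [yi for yi, ti in zip(y, t) if ti == 0]
a1, b1 = fit_linear(x1, y1)
a0, b0 = fit_linear(x0, y0)

def cate(xi):
    """Estimated conditional (heterogeneous) treatment effect at covariate xi."""
    return (a1 + b1 * xi) - (a0 + b0 * xi)

# The estimated effect is heterogeneous: larger at high x than at low x.
```

Even in this toy version, the effect estimate is a derived quantity with no single coefficient to inspect; with hundreds of trees in a causal forest, that opacity is what motivates the paper's transparency concerns.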
Stats
"The causal forest produces an APE estimate of $5753 per additional year of education with a standard error of $316."
Quotes
"Causal machine learning is beginning to be used in analysis that informs public policy. Particular techniques which estimate individual or group-level effects of interventions are the focus of this paper."
"The black-box nature of these methods makes them very different from traditional causal estimation models."
"Transparency is precisely the way in which models do this informing, taking a model of hundreds of thousands of parameters in the case of a typical causal forest and presenting the patterns in those parameters in a way that can tell the user about the underlying causal effects."