Core Concepts
The core message of this article is that missing data problems can be viewed as a form of causal inference: the goal is to identify the complete data distribution from the observed data distribution, using graphical representations and counterfactual reasoning.
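As a hedged illustration of what identification means here, consider the simplest nontrivial case. The notation is an assumption of this sketch and is not taken from the article: $X^{(1)}$ is the value a variable would take if it were always observed, $R$ is its missingness indicator ($R = 1$ means observed), $X$ is the recorded value (equal to $X^{(1)}$ whenever $R = 1$), and $C$ is a fully observed covariate. If $X^{(1)}$ is independent of $R$ given $C$, and $p(R = 1 \mid c) > 0$ for every $c$, the complete data distribution can be written in terms of observed quantities only:

```latex
\begin{align*}
p\bigl(x^{(1)}, c\bigr)
  &= p(c)\, p\bigl(x^{(1)} \mid c\bigr) \\
  &= p(c)\, p\bigl(x^{(1)} \mid c, R = 1\bigr)
     && \text{since } X^{(1)} \perp R \mid C \\
  &= p(c)\, p\bigl(x \mid c, R = 1\bigr)
     && \text{since } X = X^{(1)} \text{ when } R = 1 .
\end{align*}
```

The last line has the same form as covariate adjustment for a treatment effect in causal inference, with the missingness indicator playing the role of the treatment.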
Summary
The article examines the connections between causal inference and missing data problems, and shows how ideas from causal inference can be used to analyze missing data models and to determine when they are identified.
Key highlights:
- The authors introduce a counterfactual view of classical missing data models, where each missing variable is represented as a counterfactual variable that would have been observed had the corresponding missingness indicator been set to 1.
- The authors describe how directed acyclic graphs (DAGs) can be used to encode independence restrictions in both causal and missing data models, and how identification theory developed for causal inference can be adapted to missing data problems.
- The authors present a hierarchy of missing data DAG models, ranging from missing completely at random (MCAR) through missing at random (MAR) to missing not at random (MNAR), and discuss how the complexity of the identification techniques required depends on the type of missingness mechanism.
- The authors review several examples of missing data DAG models from the literature, showing how the graphical representations can provide intuitive interpretations of the missingness mechanisms and facilitate identification of the target parameters.
- The authors suggest that ideas explored in missing data DAG models, combined with rank preservation assumptions, may lead to novel identification results in causal inference settings.
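
The counterfactual view in the highlights above can be made concrete with a small simulation. What follows is a minimal sketch under stated assumptions, not the authors' code: the data-generating process, the variable names (`c`, `r`, `x_obs`, `x_full`), and the helper functions are all hypothetical. A binary covariate `c` is always observed, `x_full` plays the role of the counterfactual value that would be seen if the missingness indicator `r` were set to 1, and `r` depends only on `c`, so the mechanism is MAR but not MCAR.

```python
import random


def simulate(n, seed=0):
    """Draw n rows from a hypothetical MAR mechanism (illustrative only)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c = 1 if rng.random() < 0.5 else 0       # fully observed covariate
        x_full = 2.0 * c + rng.gauss(0.0, 1.0)   # counterfactual value of X
        p_obs = 0.3 if c == 1 else 0.9           # missingness depends on c only
        r = 1 if rng.random() < p_obs else 0     # r = 1 means X is observed
        x_obs = x_full if r == 1 else None       # recorded value, None if missing
        rows.append((c, r, x_obs, x_full))
    return rows


def complete_case_mean(rows):
    """Average the recorded values, ignoring why the others are missing."""
    observed = [row[2] for row in rows if row[1] == 1]
    return sum(observed) / len(observed)


def adjusted_mean(rows):
    """Average within levels of c among observed rows, then reweight by p(c)."""
    n = len(rows)
    total = 0.0
    for level in (0, 1):
        stratum = [row[2] for row in rows if row[0] == level and row[1] == 1]
        weight = sum(1 for row in rows if row[0] == level) / n
        total += weight * sum(stratum) / len(stratum)
    return total


rows = simulate(100_000)
# x_full is available only because this is a simulation; real data hide it.
true_mean = sum(row[3] for row in rows) / len(rows)
naive = complete_case_mean(rows)
adjusted = adjusted_mean(rows)
print(f"true={true_mean:.3f} complete-case={naive:.3f} adjusted={adjusted:.3f}")
```

In this setup the mean of the counterfactual variable is 1.0 in the population, while the complete-case mean converges to 0.5 because rows with `c = 1` are observed less often. The adjusted estimator recovers the target because it uses only the conditional independence that the MAR graph encodes. Under an MNAR mechanism, where `r` would depend on `x_full` itself, this simple adjustment would no longer be justified.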