Core Concepts
Causal discovery can be formulated as a knowledge graph completion problem: discovering new causal relations maps to link prediction in a knowledge graph.
Abstract
The paper presents a novel approach called CausalDisco that formulates causal discovery as a knowledge graph completion problem. The approach involves four primary phases:
- Encoding known causal relations into a causal network.
- Translating the causal network into a causal knowledge graph (CausalKG).
- Learning knowledge graph embeddings for the CausalKG, both with causal weights (CausalKGE-W) and without them (CausalKGE-Base).
- Predicting new causal links in the CausalKG using the learned embeddings.
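The first two phases can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `CausalEdge` class, the `to_triples` helper, the `causes` relation name, and the example entities and weights are all hypothetical, chosen only to show how a weighted causal network might be flattened into knowledge graph links with or without weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CausalEdge:
    """One edge of a causal network (hypothetical structure)."""
    cause: str
    effect: str
    weight: float  # strength of the causal association (made-up values below)


def to_triples(edges, with_weights):
    """Translate causal edges into (head, relation, tail[, weight]) links."""
    triples = []
    for e in edges:
        if with_weights:
            # Weighted links, as a CausalKGE-W style model would consume them.
            triples.append((e.cause, "causes", e.effect, e.weight))
        else:
            # Plain links, as a CausalKGE-Base style model would consume them.
            triples.append((e.cause, "causes", e.effect))
    return triples


# Toy causal network; entity names and weights are illustrative only.
network = [
    CausalEdge("collide_1", "move_2", 0.8),
    CausalEdge("move_2", "collide_3", 0.4),
]

base_triples = to_triples(network, with_weights=False)
weighted_triples = to_triples(network, with_weights=True)
```

Keeping the weight as an optional fourth element lets the same network feed both embedding variants, which mirrors the with/without comparison described above.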
The approach supports two types of causal discovery: causal explanation (predicting the type of a cause-entity given an effect-entity) and causal prediction (predicting the type of an effect-entity given a cause-entity).
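Both tasks can be read as link prediction queries in opposite directions. The sketch below uses a TransE-style score (a head plus a relation vector should land near the tail) purely for illustration; the summary does not say which embedding model the paper uses, and the entity types, the hand-set 2-D vectors, and the function names are assumptions.

```python
import math

# Hand-set toy embeddings for entity types (illustrative, not learned).
TYPE_EMB = {
    "Collide": (0.0, 0.0),
    "Move": (1.0, 0.0),
    "Stop": (0.0, 1.0),
}
CAUSES = (1.0, 0.0)  # toy embedding of a hypothetical "causes" relation


def score(head, relation, tail):
    """TransE-style plausibility: higher (closer to 0) means more plausible."""
    translated = tuple(h + r for h, r in zip(head, relation))
    return -math.dist(translated, tail)


def causal_prediction(cause_type):
    """Rank candidate effect types for a given cause type (tail prediction)."""
    head = TYPE_EMB[cause_type]
    return sorted(TYPE_EMB,
                  key=lambda t: score(head, CAUSES, TYPE_EMB[t]),
                  reverse=True)


def causal_explanation(effect_type):
    """Rank candidate cause types for a given effect type (head prediction)."""
    tail = TYPE_EMB[effect_type]
    return sorted(TYPE_EMB,
                  key=lambda h: score(TYPE_EMB[h], CAUSES, tail),
                  reverse=True)


top_effect = causal_prediction("Collide")[0]   # best-scoring effect type
top_cause = causal_explanation("Move")[0]      # best-scoring cause type
```

With these toy vectors, `Collide + causes` lands exactly on `Move`, so `Move` ranks first as the predicted effect of `Collide`, and `Collide` ranks first as the explanation for `Move`.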
The evaluation is performed on the CLEVRER-Humans benchmark dataset, which contains simulated videos of collision events with human-annotated causal relations and weights. The results show that incorporating causal weights into the knowledge graph embeddings (CausalKGE-W) improves causal discovery performance compared to embeddings without causal weights (CausalKGE-Base). The paper also introduces a novel Markov-based data split technique to address potential model bias in the evaluation.
Stats
The causal weight represents the strength of the causal association between entities in the knowledge graph, measured by the total causal effect estimated using do-calculus.
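The summary does not spell out the formula. One common way to write a total causal effect with the do-operator, given here as an assumption and not as the paper's exact definition, is the difference in the expected outcome under two interventions on the cause. The symbols $X$ (cause), $Y$ (effect), and the intervention values $x_0$, $x_1$ are introduced only for illustration:

```latex
\[
\mathrm{TCE}(X \to Y) \;=\; \mathbb{E}\left[\, Y \mid do(X = x_1) \,\right] \;-\; \mathbb{E}\left[\, Y \mid do(X = x_0) \,\right]
\]
```

Under this reading, a larger absolute value indicates a stronger causal association and would correspond to a larger weight on the link.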
The CLEVRER-Humans dataset contains 764 causal event graphs (CEGs) after pre-processing.
The CausalKG derived from the CLEVRER-Humans dataset contains over 48K links, 5664 entities, 31 entity types, and 10 relations.
Quotes
"Causal discovery is defined as the process of finding new causal relations by analyzing observational data [2]."
"The newly discovered causal relations are encoded as a causal network with edges representing the causal links between entities. Each causal link may also be annotated with weights representing the strength of the causal connection."