Key Concepts
The Weisfeiler-Leman (WL) test is commonly used to measure the expressive power of graph neural networks, but this approach has significant limitations and ethical implications that are often overlooked.
Summary
The paper systematically analyzes the reliability and validity of using the Weisfeiler-Leman (WL) test to measure the expressive power of graph neural networks (GNNs). The key insights are:
- Conceptualization of expressive power:
  - Graph ML practitioners hold varying and sometimes conflicting conceptualizations of expressive power.
  - Many practitioners believe expressive power is solely an architectural property, captured by the ability to distinguish non-isomorphic graphs/nodes.
- Limitations of the WL test:
  - The WL test does not guarantee isometry, can be irrelevant to real-world graph tasks, and may not promote generalization or trustworthiness.
  - The WL test can have negative implications for the fairness, robustness, and privacy of graph ML models.
- Benchmark analysis:
  - 1-WL can distinguish virtually all non-isomorphic graphs/nodes in many popular graph ML benchmarks.
  - GNNs may learn representations that are better suited to the task labels than WL-aligned representations.
- Implications:
  - Graph ML practitioners should recognize that WL may not align with their task, and devise other measurements of expressive power.
  - Alternatively, if WL does not limit GNN performance on benchmarks, more rigorous benchmarks are needed to assess expressive power.
  - The paper argues for extensional definitions and measurement of expressive power, and provides guiding questions to facilitate the creation of such benchmarks.
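For readers unfamiliar with the test itself, the following is a minimal Python sketch of 1-WL color refinement, the procedure that "distinguishing non-isomorphic graphs" refers to above. It is an illustration, not code from the paper. The function names (`wl_color_histograms`, `wl_distinguishes`), the adjacency-dict graph format, and the choice to run a fixed number of rounds rather than stop at convergence are all assumptions made for brevity.

```python
from collections import Counter


def wl_color_histograms(graphs, rounds):
    """Run 1-WL color refinement jointly on several graphs.

    Each graph is an adjacency dict {node: [neighbors]} (a format assumed
    for this sketch). Returns one color histogram (Counter) per graph.
    """
    # Every node starts with the same color (no node features assumed).
    colors = [{v: 0 for v in g} for g in graphs]
    for _ in range(rounds):
        palette = {}  # shared across graphs so colors stay comparable
        refined = []
        for g, c in zip(graphs, colors):
            new_c = {}
            for v in g:
                # Signature: own color plus the multiset of neighbor colors.
                signature = (c[v], tuple(sorted(c[u] for u in g[v])))
                new_c[v] = palette.setdefault(signature, len(palette))
            refined.append(new_c)
        colors = refined
    return [Counter(c.values()) for c in colors]


def wl_distinguishes(g1, g2, rounds=None):
    """Return True if 1-WL assigns the two graphs different color histograms."""
    # As many rounds as there are nodes is enough for the coloring to stabilize.
    if rounds is None:
        rounds = max(len(g1), len(g2))
    h1, h2 = wl_color_histograms([g1, g2], rounds)
    return h1 != h2


# Two disjoint triangles vs. a 6-cycle: non-isomorphic, but every node has
# degree 2, so 1-WL cannot tell them apart.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(wl_distinguishes(two_triangles, hexagon))  # False

# A 3-node path vs. a triangle: the degree difference is detected.
path = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(wl_distinguishes(path, triangle))  # True
```

The first pair is a standard case in which 1-WL fails to separate two non-isomorphic graphs. Note that the test only reports whether two graphs receive different colorings; it says nothing about whether that difference matters for a given task's labels, which is the gap the summary above highlights.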
Statistics
1-WL can distinguish virtually all of the non-isomorphic graphs and nodes in many graph ML benchmarks.
GNNs may learn representations that are better suited to a task's labels than WL-aligned representations.
Quotes
"𝑘-WL does not guarantee isometry, can be irrelevant to real-world graph tasks, and may not promote generalization or trustworthiness."
"Comparing to 𝑘-WL has poor structural validity."
"𝑘-WL can have negative implications for the fairness, robustness, and privacy of graph ML."