The authors propose an approach called Combine-GNN to solve the phylogenetic tree containment problem using Graph Neural Networks (GNNs). The key ideas are:
Combining the given phylogenetic network and tree into a single graph that respects the leaf labels. This makes the GNN aware of the leaf labels while keeping the model inductive, so it can handle instances with more leaves than any seen in the training data.
Using a directed GNN (Dir-GNN) to effectively capture the directed nature of the phylogenetic graphs.
Extracting multi-scale node representations by concatenating embeddings from different GNN layers, and using a readout operation to obtain the graph-level prediction.
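The three ideas above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. It assumes that "respecting the leaf labels" means identifying equally labelled leaves as one shared node, that a directed layer aggregates predecessors and successors separately, and that the readout is a mean over nodes. The function names, the fixed scalar weights (a real model would learn them), and the toy instance are all hypothetical.

```python
from collections import defaultdict


def combine_by_leaf_labels(network_edges, tree_edges, leaf_labels):
    """Merge two directed graphs, sharing the leaves that carry the same label.

    Internal nodes get an "N:" or "T:" prefix so they never collide;
    leaves keep their bare label and therefore become one shared node.
    (Assumed reading of the combination step, not confirmed by the summary.)
    """
    def rename(node, prefix):
        return node if node in leaf_labels else f"{prefix}:{node}"

    edges = [(rename(u, "N"), rename(v, "N")) for u, v in network_edges]
    edges += [(rename(u, "T"), rename(v, "T")) for u, v in tree_edges]
    return edges


def directed_layer(h, edges, w_self=1.0, w_in=0.5, w_out=0.25):
    """One direction-aware message-passing step on scalar node features.

    Predecessors and successors are averaged separately and weighted
    differently, so reversing an edge changes the result.
    """
    preds, succs = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)

    def mean(nodes):
        return sum(h[n] for n in nodes) / len(nodes) if nodes else 0.0

    return {
        n: w_self * h[n] + w_in * mean(preds[n]) + w_out * mean(succs[n])
        for n in h
    }


def multiscale_readout(h0, edges, num_layers=2):
    """Concatenate each node's features from every layer, then mean-pool."""
    layers = [h0]
    for _ in range(num_layers):
        layers.append(directed_layer(layers[-1], edges))
    # Per-node multi-scale vector: one entry per layer (input included).
    node_vecs = [[layer[n] for layer in layers] for n in h0]
    dim = num_layers + 1
    return [sum(vec[i] for vec in node_vecs) / len(node_vecs) for i in range(dim)]


# Toy instance (hypothetical): a network with one reticulation node "h"
# and a tree, both over the leaves x, y, z.
network_edges = [("r", "a"), ("r", "b"), ("a", "h"), ("b", "h"),
                 ("a", "x"), ("h", "y"), ("b", "z")]
tree_edges = [("r", "u"), ("u", "x"), ("u", "y"), ("r", "z")]
leaves = {"x", "y", "z"}

combined = combine_by_leaf_labels(network_edges, tree_edges, leaves)
nodes = sorted({n for edge in combined for n in edge})
h0 = {n: 1.0 for n in nodes}              # constant initial node feature
graph_vec = multiscale_readout(h0, combined)
print(len(nodes), len(combined), graph_vec)
```

Because the leaves are the only nodes shared between the two parts of the combined graph, information about the tree reaches the network side only through them. In a trained model, the pooled vector would be passed to a classifier that predicts whether the tree is contained in the network.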
The authors demonstrate that Combine-GNN achieves over 95% balanced accuracy on synthetic test instances with up to 100 leaves, outperforming baseline approaches. It also shows promising performance on real-world phylogenetic datasets. The runtime analysis indicates that Combine-GNN scales polynomially, in contrast to the exponential time complexity of the exact tree containment algorithm.
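For readers unfamiliar with the metric, balanced accuracy for a binary task is the mean of the true-positive rate and the true-negative rate, which keeps a classifier from scoring well simply by favouring the majority class. The sketch below assumes the usual binary definition with label 1 meaning "contained"; the labels in the example are made up for illustration and are not results from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    return (sensitivity + specificity) / 2


# Illustrative labels only: 4 positive and 2 negative instances.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)
print(score)
```

Here the sensitivity is 3/4 and the specificity is 1/2, so the balanced accuracy is 0.625, which is lower than the plain accuracy of 4/6 on the same predictions.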
The authors also conduct extensive ablation studies to analyze the impact of different design choices of Combine-GNN, such as the use of directed message passing, node features, GNN architectures, and embedding sizes.
Key Insights Distilled From
by Arkadiy Dush... at arxiv.org 04-16-2024
https://arxiv.org/pdf/2404.09812.pdf