The authors introduce "geom2vec", a method that leverages pretrained graph neural networks (GNNs) as geometric featurizers for analyzing molecular dynamics simulations. The key idea is to decouple the training of the GNN encoder from the training of downstream task-specific models, enabling the use of large, diverse datasets for pretraining the GNN and efficient analysis of molecular dynamics data with limited computational resources.
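The decoupling described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_frozen_encoder` is a hypothetical stand-in for a pretrained GNN (it simply returns sorted pairwise distances), and `CountingEncoder` and `train_linear_head` are hypothetical helpers, not part of geom2vec. The point is only that the encoder runs once per frame, after which the cached features are reused across all downstream training epochs.

```python
import math
import random

def toy_frozen_encoder(frame):
    """Hypothetical stand-in for a pretrained, frozen GNN.
    Maps one frame (a list of (x, y, z) atoms) to sorted pairwise distances."""
    dists = [math.dist(a, b) for i, a in enumerate(frame) for b in frame[i + 1:]]
    return sorted(dists)

class CountingEncoder:
    """Wraps an encoder and counts how many times it is evaluated."""
    def __init__(self, encode):
        self.encode = encode
        self.calls = 0

    def __call__(self, frame):
        self.calls += 1
        return self.encode(frame)

def train_linear_head(features, targets, epochs=5, lr=0.01):
    """Toy downstream model: a linear head fit by SGD on cached features."""
    weights = [0.0] * len(features[0])
    for _ in range(epochs):
        for x, y in zip(features, targets):
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights

# Synthetic "trajectory": 20 frames of 4 atoms each (random data, illustration only).
rng = random.Random(0)
trajectory = [[(rng.random(), rng.random(), rng.random()) for _ in range(4)]
              for _ in range(20)]

# Stage 1: run the frozen encoder once per frame and cache the result.
encoder = CountingEncoder(toy_frozen_encoder)
cached_features = [encoder(frame) for frame in trajectory]

# Stage 2: train the downstream model on cached features only.
targets = [max(f) for f in cached_features]  # toy label: largest pairwise distance
weights = train_linear_head(cached_features, targets, epochs=5)

# The encoder call count equals the number of frames, independent of the epoch count.
print(encoder.calls, len(trajectory))
```

In a joint-training setup the encoder would instead be evaluated (and differentiated through) on every frame in every epoch, which is where the extra cost would come from.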
The authors first pretrain a GNN with a self-supervised denoising objective on a large dataset of molecular conformations. This lets the GNN learn transferable structural representations that capture molecular geometric patterns and can be used without further fine-tuning. They then use the pretrained GNN as a feature encoder to analyze molecular dynamics trajectories of three fast-folding proteins (chignolin, trp-cage, and villin) in two downstream tasks: learning slowly decorrelating modes with VAMPnets and identifying metastable states with the state predictive information bottleneck (SPIB) framework.
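As a rough illustration of what a self-supervised denoising objective can look like, the sketch below perturbs atomic coordinates with Gaussian noise and scores a prediction of that noise by mean squared error. This is one common formulation, assumed here for illustration; the summary does not specify the noise model, the prediction target, or the loss the authors use, and the names `perturb` and `denoising_loss` are hypothetical.

```python
import random

def perturb(coords, sigma, rng):
    """Add i.i.d. Gaussian noise to every coordinate (assumed noise model).
    Returns the noisy coordinates and the noise that was added."""
    noise = [[rng.gauss(0.0, sigma) for _ in atom] for atom in coords]
    noisy = [[c + e for c, e in zip(atom, eps)] for atom, eps in zip(coords, noise)]
    return noisy, noise

def denoising_loss(predicted_noise, true_noise):
    """Mean squared error between predicted and true per-coordinate noise."""
    sq_errors = [(p - t) ** 2
                 for p_atom, t_atom in zip(predicted_noise, true_noise)
                 for p, t in zip(p_atom, t_atom)]
    return sum(sq_errors) / len(sq_errors)

rng = random.Random(0)
coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]]  # toy 3-atom conformation
noisy, noise = perturb(coords, sigma=0.1, rng=rng)

# A model would take `noisy` as input and output a noise estimate per atom.
# Two reference points: an untrained guess of zero noise, and a perfect prediction.
zero_guess = [[0.0, 0.0, 0.0] for _ in coords]
loss_untrained = denoising_loss(zero_guess, noise)
loss_perfect = denoising_loss(noise, noise)
```

No labels are needed: the training signal comes entirely from the noise the procedure itself adds, which is why such an objective can be applied to large unlabeled collections of conformations.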
The results demonstrate that the GNN-based representations can capture important structural features, such as side chain dynamics, that are missed by approaches based on manually selected internal coordinates. The authors also show that decoupling GNN training from downstream task training significantly reduces the computational requirements compared to training the GNN and downstream models jointly.
Key insights distilled from the paper by Zihan Pengme... at arxiv.org (10-01-2024): https://arxiv.org/pdf/2409.19838.pdf