Core Concepts
This study introduces a novel Generative Adversarial Network (GAN) called InstGAN that efficiently generates molecules with multi-property optimization, outperforming various baseline models.
Abstract
The paper presents a novel Generative Adversarial Network (GAN) called InstGAN for efficient molecular generation with multi-property optimization. Key highlights:
InstGAN uses an autoregressive generator and a token-level discriminator to generate SMILES strings, so rewards can be assigned densely, at every token rather than only at the end of a sequence.
InstGAN employs an actor-critic reinforcement learning (RL) algorithm to calculate instant and global rewards, which improves training stability and scalability compared to previous RL approaches based on Monte Carlo tree search (MCTS).
Adding a maximized information entropy (MIE) term to the generator's loss function mitigates mode collapse and promotes diversity among the generated molecules.
Experimental results on the ZINC and ChEMBL datasets demonstrate that InstGAN outperforms various baseline models, including VAE-, flow-, diffusion-, and GAN-based approaches, in terms of validity, uniqueness, novelty, and total score.
InstGAN efficiently generates molecules under both single-property and multi-property optimization, achieving substantial improvements in the targeted chemical properties relative to the training datasets.
Ablation studies highlight the importance of the key components of InstGAN, including instant rewards, global rewards, and MIE, in achieving high-quality molecular generation.
Case studies show that InstGAN can generate molecules with high drug-likeness (QED) and high dopamine receptor D2 (DRD2) activity that closely resemble approved drugs.
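The token-level rewards and the entropy term described in the highlights above can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the function names, the way the global reward is added to each instant reward, the discount factor, and the entropy weight are all assumptions chosen only to show the general shape of an entropy-regularized actor-critic loss over a token sequence.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def discounted_returns(rewards, gamma=0.99):
    """Return-to-go for every token position, computed right to left."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    return returns[::-1]


def token_level_actor_loss(log_probs, token_probs, instant_rewards,
                           global_reward, values,
                           gamma=0.99, entropy_weight=0.01):
    """Hypothetical entropy-regularized actor loss for one sequence.

    log_probs       -- log-probability of each sampled token
    token_probs     -- full next-token distribution at each step
    instant_rewards -- per-token scores (e.g. from a token-level discriminator)
    global_reward   -- one sequence-level score (e.g. a property value)
    values          -- critic's value estimate at each step
    """
    # Assumption: the global reward is simply added to every instant reward.
    rewards = [ir + global_reward for ir in instant_rewards]
    returns = discounted_returns(rewards, gamma)
    # Advantage = observed return minus the critic's baseline.
    advantages = [g - v for g, v in zip(returns, values)]
    policy_term = -sum(lp * a for lp, a in zip(log_probs, advantages))
    # Subtracting the entropy means that minimizing the loss maximizes entropy.
    entropy_term = sum(entropy(p) for p in token_probs)
    return (policy_term - entropy_weight * entropy_term) / len(log_probs)
```

With all other inputs fixed, a generator whose next-token distributions are more spread out receives a lower loss than one that puts all its mass on a single token, which is the intuition behind using an entropy term against mode collapse.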
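The validity, uniqueness, and novelty metrics named in the highlights above are commonly computed along the following lines. This is a minimal sketch under stated assumptions: exact definitions vary between papers, the function names are hypothetical, strings are compared as-is (so SMILES are assumed to be canonicalized already), and the balanced-parentheses check is only a toy stand-in for a real chemistry parser.

```python
def balanced_parentheses(smiles):
    """Toy validity check: branch parentheses must be balanced.

    Stand-in only. A real pipeline would parse the string with a
    cheminformatics toolkit instead of inspecting characters.
    """
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def generation_metrics(generated, training_set, is_valid):
    """Compute validity, uniqueness, and novelty for generated strings.

    Assumed definitions (one common convention):
      validity   = valid / generated
      uniqueness = distinct valid / valid
      novelty    = distinct valid not in training set / distinct valid
    """
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training_set)
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


# Toy data: one malformed string, one duplicate, one training-set molecule.
generated = ["CCO", "CCO", "c1ccccc1", "C(C", "CCN"]
training = {"CCO"}
metrics = generation_metrics(generated, training, balanced_parentheses)
print(metrics)
```

In this toy run, four of the five strings pass the check, three of those four are distinct, and two of the three distinct strings are absent from the training set.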
Stats
The ZINC dataset contains 250,000 drug-like molecules, with a median of 23 and a maximum of 38 heavy atoms per molecule.
The ChEMBL dataset includes approximately 1.6 million molecules, with a median of 27 and a maximum of 88 heavy atoms per molecule.
Quotes
"This study introduces a novel GAN based on actor-critic RL with instant rewards (IR) and global rewards (GR), called InstGAN, to generate molecules at the token-level with multi-property optimization."
"Experimental results validate that InstGAN outperforms other baselines, achieves comparable performance to state-of-the-art (SOTA) models, and demonstrates the ability to generate molecules with multi-property optimization in a fast and efficient manner."