Bruno Macedo, Inês Ribeiro Vaz, Tiago Taveira Gomes
{"title":"Novel molecule design with POWGAN, a policy-optimized Wasserstein generative adversarial network","authors":"Bruno Macedo, Inês Ribeiro Vaz, Tiago Taveira Gomes","doi":"10.1186/s13321-025-01114-0","DOIUrl":null,"url":null,"abstract":"<p>Generative artificial intelligence has the potential to open new vast chemical search spaces, yet existing reinforcement-guided generative adversarial networks (GANs) struggle to produce non-fragmented and property-oriented molecules at scale without compromising other properties. To overcome these limitations, we present Policy-Optimised Wasserstein GAN (POWGAN), a graph-based generator that incorporates a dynamically scaled reward into adversarial training. The scaling factor increases when progress stalls, keeping gradients informative while steadily steering the generator towards user-defined objectives. When POWGAN replaces the loss function in a previous MedGAN architecture, using graph connectivity (non-fragmentation) as the target property, attains 1.00 fully connected quinoline-like molecules, compared to previous 0.62, while maintaining novelty (0.93) and uniqueness (0.95). The resulting model R-MedGAN produces > 12,000 novel quinoline-like, a significant increase over its predecessor under identical experimental conditions. Chemical space visualizations demonstrate that these molecules populate regions not present in the training dataset or MedGAN, confirming genuine scaffold innovation. By achieving a new architecture capable of orienting generative process towards a reward, our study also showed this strategy is capable of progressing towards druglikeness properties. Synthetic Accessibility Scores (SAS) measured by Erlth algorithm between 1 and 6, and lipophilicity measured as LogP between 1.35 and 1.80, both increased the proportion from 8 to 65% and 17% to 45%, respectively, compared to baseline. Our study shows R-MedGAN architecture, incorporating POWGAN loss, is also generalizable for models trained with different molecular scaffolds other than quinoline originally tested in MedGAN (R-MedGAN-QNL). For indole (R-MedGAN-IND) and imidazole (R-MedGAN-IMZ) datasets, connectivity increased from 0.38 and 0.50 up to 1.00 during training. This study provides evidence that an adaptive reward-scaling policy in a Wasserstein GAN can simultaneously guide the generative training towards a reward by enhancing molecular connectivity, expand generative throughput, preserve diversity, and improve drug-likeness properties. By eliminating the limitation trade-off between property optimisation and sample diversity, POWGAN and its R-MedGAN implementation advance the state of the art in molecule-generating GANs and deploys a robust, scalable platform for high-throughput, goal-directed chemical exploration in early-stage drug discovery. These findings underscore the effectiveness of adaptive reinforcement-driven strategies in generative adversarial networks oriented by rewards for molecular discovery.</p><p>In this work we introduce POWGAN, a policy-optimized Wasserstein GAN that uses adaptive reward scaling to improve goal-directed molecule generation. Integrated into MedGAN (R-MedGAN), it increases the number of valid, connected, and novel molecules under identical settings while maintaining diversity and drug-likeness. This demonstrates that adaptive reward strategies can jointly enhance molecular topology and property optimization at scale.</p>","PeriodicalId":617,"journal":{"name":"Journal of Cheminformatics","volume":"18 1","pages":""},"PeriodicalIF":5.7000,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1186/s13321-025-01114-0.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Cheminformatics","FirstCategoryId":"92","ListUrlMain":"https://link.springer.com/article/10.1186/s13321-025-01114-0","RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CHEMISTRY, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0
Abstract
Generative artificial intelligence has the potential to open new vast chemical search spaces, yet existing reinforcement-guided generative adversarial networks (GANs) struggle to produce non-fragmented and property-oriented molecules at scale without compromising other properties. To overcome these limitations, we present Policy-Optimised Wasserstein GAN (POWGAN), a graph-based generator that incorporates a dynamically scaled reward into adversarial training. The scaling factor increases when progress stalls, keeping gradients informative while steadily steering the generator towards user-defined objectives. When POWGAN replaces the loss function in a previous MedGAN architecture, using graph connectivity (non-fragmentation) as the target property, attains 1.00 fully connected quinoline-like molecules, compared to previous 0.62, while maintaining novelty (0.93) and uniqueness (0.95). The resulting model R-MedGAN produces > 12,000 novel quinoline-like, a significant increase over its predecessor under identical experimental conditions. Chemical space visualizations demonstrate that these molecules populate regions not present in the training dataset or MedGAN, confirming genuine scaffold innovation. By achieving a new architecture capable of orienting generative process towards a reward, our study also showed this strategy is capable of progressing towards druglikeness properties. Synthetic Accessibility Scores (SAS) measured by Erlth algorithm between 1 and 6, and lipophilicity measured as LogP between 1.35 and 1.80, both increased the proportion from 8 to 65% and 17% to 45%, respectively, compared to baseline. Our study shows R-MedGAN architecture, incorporating POWGAN loss, is also generalizable for models trained with different molecular scaffolds other than quinoline originally tested in MedGAN (R-MedGAN-QNL). For indole (R-MedGAN-IND) and imidazole (R-MedGAN-IMZ) datasets, connectivity increased from 0.38 and 0.50 up to 1.00 during training. This study provides evidence that an adaptive reward-scaling policy in a Wasserstein GAN can simultaneously guide the generative training towards a reward by enhancing molecular connectivity, expand generative throughput, preserve diversity, and improve drug-likeness properties. By eliminating the limitation trade-off between property optimisation and sample diversity, POWGAN and its R-MedGAN implementation advance the state of the art in molecule-generating GANs and deploys a robust, scalable platform for high-throughput, goal-directed chemical exploration in early-stage drug discovery. These findings underscore the effectiveness of adaptive reinforcement-driven strategies in generative adversarial networks oriented by rewards for molecular discovery.
In this work we introduce POWGAN, a policy-optimized Wasserstein GAN that uses adaptive reward scaling to improve goal-directed molecule generation. Integrated into MedGAN (R-MedGAN), it increases the number of valid, connected, and novel molecules under identical settings while maintaining diversity and drug-likeness. This demonstrates that adaptive reward strategies can jointly enhance molecular topology and property optimization at scale.
期刊介绍:
Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling.
Coverage includes, but is not limited to:
chemical information systems, software and databases, and molecular modelling,
chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases,
computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.