Novel molecule design with POWGAN, a policy-optimized Wasserstein generative adversarial network

IF 5.7, Zone 2 (Chemistry), Q1, CHEMISTRY, MULTIDISCIPLINARY

Journal of Cheminformatics, Pub Date: 2025-12-01, DOI: 10.1186/s13321-025-01114-0

Bruno Macedo, Inês Ribeiro Vaz, Tiago Taveira Gomes


Citations: 0

Abstract

Generative artificial intelligence has the potential to open vast new chemical search spaces, yet existing reinforcement-guided generative adversarial networks (GANs) struggle to produce non-fragmented, property-oriented molecules at scale without compromising other properties. To overcome these limitations, we present Policy-Optimised Wasserstein GAN (POWGAN), a graph-based generator that incorporates a dynamically scaled reward into adversarial training. The scaling factor increases when progress stalls, keeping gradients informative while steadily steering the generator towards user-defined objectives. When POWGAN replaces the loss function in the earlier MedGAN architecture, with graph connectivity (non-fragmentation) as the target property, the proportion of fully connected quinoline-like molecules reaches 1.00, compared with 0.62 previously, while novelty (0.93) and uniqueness (0.95) are maintained. The resulting model, R-MedGAN, produces more than 12,000 novel quinoline-like molecules, a significant increase over its predecessor under identical experimental conditions. Chemical space visualizations demonstrate that these molecules populate regions not present in the training dataset or in MedGAN output, confirming genuine scaffold innovation. Beyond connectivity, the same reward-oriented architecture can also steer generation towards drug-likeness properties. The proportion of molecules with a Synthetic Accessibility Score (SAS, measured with the Ertl algorithm) between 1 and 6 increased from 8% to 65%, and the proportion with lipophilicity (LogP) between 1.35 and 1.80 increased from 17% to 45%, compared with baseline. The R-MedGAN architecture with the POWGAN loss also generalizes to models trained on molecular scaffolds other than the quinoline originally tested in MedGAN (R-MedGAN-QNL). For the indole (R-MedGAN-IND) and imidazole (R-MedGAN-IMZ) datasets, connectivity increased from 0.38 and 0.50, respectively, to 1.00 during training.
This study provides evidence that an adaptive reward-scaling policy in a Wasserstein GAN can simultaneously steer generative training towards a reward (here, enhanced molecular connectivity), expand generative throughput, preserve diversity, and improve drug-likeness properties. By removing the trade-off between property optimisation and sample diversity, POWGAN and its R-MedGAN implementation advance the state of the art in molecule-generating GANs and provide a robust, scalable platform for high-throughput, goal-directed chemical exploration in early-stage drug discovery. These findings underscore the effectiveness of adaptive, reinforcement-driven strategies in reward-oriented GANs for molecular discovery.
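
The abstract describes the reward scale only qualitatively: it grows when progress stalls. One way this could look is sketched below in Python. This is a minimal illustration, not the authors' implementation. The class `AdaptiveRewardScale`, the patience-based stall test, the multiplicative growth factor, the cap, and the way the scaled reward is subtracted from the Wasserstein generator loss are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveRewardScale:
    """Hypothetical scheduler: raise the reward weight when the reward plateaus."""

    scale: float = 1.0         # current weight on the reward term (assumed start)
    growth: float = 1.5        # multiplicative increase applied on a stall (assumed)
    max_scale: float = 100.0   # upper bound on the weight (assumed)
    patience: int = 3          # non-improving updates that count as a stall (assumed)
    min_delta: float = 1e-3    # smallest change that counts as progress (assumed)
    best: float = float("-inf")
    stalled: int = 0

    def update(self, mean_reward: float) -> float:
        """Record the latest mean reward and return the weight to use next."""
        if mean_reward > self.best + self.min_delta:
            # Progress: remember the new best and reset the stall counter.
            self.best = mean_reward
            self.stalled = 0
        else:
            self.stalled += 1
            if self.stalled >= self.patience:
                # Stall: increase the weight so the reward term counts for more.
                self.scale = min(self.scale * self.growth, self.max_scale)
                self.stalled = 0
        return self.scale


def generator_loss(critic_scores, rewards, scale):
    """Wasserstein generator loss with a scaled reward term (illustrative form).

    critic_scores: critic outputs for generated samples.
    rewards: per-sample rewards, e.g. 1.0 for a fully connected graph, else 0.0.
    """
    n = len(critic_scores)
    wgan_term = -sum(critic_scores) / n   # standard WGAN generator objective
    reward_term = sum(rewards) / n        # mean reward over the batch
    return wgan_term - scale * reward_term
```

In a training loop, `update` would be called once per epoch with the batch-averaged reward (for example, the fraction of connected molecules), and the returned weight passed to `generator_loss`.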

Scientific contribution: In this work we introduce POWGAN, a policy-optimized Wasserstein GAN that uses adaptive reward scaling to improve goal-directed molecule generation. Integrated into MedGAN (R-MedGAN), it increases the number of valid, connected, and novel molecules under identical settings while maintaining diversity and drug-likeness. This demonstrates that adaptive reward strategies can jointly enhance molecular topology and property optimization at scale.
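
The abstract reports connectivity, uniqueness, novelty, and the proportion of molecules inside a property window, but does not define them. The Python sketch below shows one common way to compute such quantities. It is an assumption-laden illustration, not the paper's evaluation code. Molecules are represented here as a hypothetical `(num_atoms, bonds)` pair for the connectivity check and as canonical identifier strings for the set-based metrics, and all function names are invented for this example.

```python
from collections import deque


def is_connected(num_atoms, bonds):
    """Return True if the molecular graph forms a single connected component."""
    if num_atoms == 0:
        return False
    adjacency = {i: [] for i in range(num_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    seen = {0}
    queue = deque([0])
    while queue:  # breadth-first search starting from atom 0
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == num_atoms


def connectivity(graphs):
    """Fraction of generated graphs that are not fragmented."""
    return sum(is_connected(n, bonds) for n, bonds in graphs) / len(graphs)


def uniqueness(generated):
    """Fraction of generated identifiers that are distinct."""
    return len(set(generated)) / len(generated)


def novelty(generated, training):
    """Fraction of distinct generated identifiers absent from the training set."""
    unique = set(generated)
    return len(unique - set(training)) / len(unique)


def fraction_in_range(values, low, high):
    """Fraction of property values (e.g. LogP or SAS) inside [low, high]."""
    return sum(low <= v <= high for v in values) / len(values)
```

For instance, `fraction_in_range(logp_values, 1.35, 1.80)` would give the kind of proportion quoted for the LogP window, where `logp_values` is a list computed by whichever cheminformatics toolkit is in use.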


Source journal

Journal of Cheminformatics CHEMISTRY, MULTIDISCIPLINARY-COMPUTER SCIENCE, INFORMATION SYSTEMS

CiteScore

14.10

Self-citation rate

7.00%

Articles published

Review time

3 months

About the journal: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases, and molecular modelling, chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases, computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.