MGDM: Molecular generation using a multinomial diffusion model
Authors: Sisi Yuan, Chen Zhao, Lin Liu, Guifei Zhou
Journal: Methods, volume 239, pages 1-9 (published 2025-03-04)
DOI: 10.1016/j.ymeth.2025.03.001
URL: https://www.sciencedirect.com/science/article/pii/S1046202325000532
Journal impact factor: 4.2; JCR category: Biochemical Research Methods (Q1)
Accurate analysis of molecular structures and the rapid generation of valid molecules remain significant challenges in de novo drug design. In this study, we propose the Multinomial Generated Diffusion Model (MGDM) for molecular generation. The model uses a multinomial diffusion framework to process discrete data, with a focus on learning the multinomial distribution inherent in the dataset. During generation, the model progressively denoises molecules, starting from a uniform noise distribution and ultimately producing valid molecular structures. We first generate molecules unconditionally to expand the compound library. We then generate molecules with specific properties to assess the model’s capacity for conditional generation. For this, we implement a classifier-free guidance strategy, which steers the diffusion model without the need to train separate classifier models. To validate the effectiveness of our framework, we conducted experiments on the Molecular Sets (MOSES) dataset. The results demonstrate that, compared with several state-of-the-art methods, MGDM generates valid molecules while achieving superior or comparable performance in terms of novelty and diversity.
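The abstract gives no equations, so the following is only a minimal sketch of the two ideas it names, written under stated assumptions rather than as MGDM's actual implementation. It assumes the commonly used multinomial diffusion marginal, in which a one-hot token is mixed with the uniform distribution over K categories, and it assumes classifier-free guidance is applied by combining conditional and unconditional logits with a guidance weight w. The function names, the 4-symbol vocabulary, and the example logits are all hypothetical.

```python
import math
import random


def forward_marginal(x0_onehot, alpha_bar_t):
    """Assumed noising marginal: q(x_t | x_0) = Cat(alpha_bar_t * x_0 + (1 - alpha_bar_t) / K).

    alpha_bar_t = 1 leaves the token untouched; alpha_bar_t = 0 gives uniform noise.
    """
    k = len(x0_onehot)
    return [alpha_bar_t * p + (1.0 - alpha_bar_t) / k for p in x0_onehot]


def sample_category(probs, rng):
    """Draw one category index from a probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def guided_probs(cond_logits, uncond_logits, w):
    """Assumed classifier-free guidance rule on logits: (1 + w) * cond - w * uncond.

    w = 0 reduces to the purely conditional prediction.
    """
    mixed = [(1.0 + w) * c - w * u for c, u in zip(cond_logits, uncond_logits)]
    return softmax(mixed)


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = [0.0, 1.0, 0.0, 0.0]  # hypothetical one-hot token over a 4-symbol vocabulary

    # The token distribution drifts from the data towards uniform noise.
    for alpha_bar in (1.0, 0.5, 0.0):
        probs = forward_marginal(x0, alpha_bar)
        print(alpha_bar, probs, "sample:", sample_category(probs, rng))

    # Hypothetical denoiser outputs with and without the property condition.
    cond = [0.2, 1.5, 0.1, -0.3]
    uncond = [0.4, 0.9, 0.3, 0.0]
    print(guided_probs(cond, uncond, w=2.0))
```

In this sketch a larger w pushes the sampling distribution further towards categories that the conditional prediction favours over the unconditional one, which is how a single network could be steered towards a target property without a separate classifier.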
About the journal:
Methods focuses on rapidly developing techniques in the experimental biological and medical sciences.
Each topical issue, organized by a guest editor who is an expert in the area covered, consists solely of invited quality articles by specialist authors, many of them reviews. Issues are devoted to specific technical approaches with emphasis on clear detailed descriptions of protocols that allow them to be reproduced easily. The background information provided enables researchers to understand the principles underlying the methods; other helpful sections include comparisons of alternative methods giving the advantages and disadvantages of particular methods, guidance on avoiding potential pitfalls, and suggestions for troubleshooting.