Yaoxiang Zhang, Junteng Ma, Ze Zhang, Zhaoyang Dong, Shuang Wang
Expert Systems with Applications, Volume 298, Article 129820. Published 2025-09-23. DOI: 10.1016/j.eswa.2025.129820
Impact factor: 7.5. JCR: Q1 (Computer Science, Artificial Intelligence).
https://www.sciencedirect.com/science/article/pii/S0957417425034359
MSIDiff: Multi-stage interaction-aware diffusion model for protein-specific 3D molecule generation
Structure-based drug design (SBDD) aims to develop 3D ligand molecules that bind with high affinity to specific protein targets, which requires accurately capturing the complex interactions between proteins and ligands. Although existing diffusion models have shown potential in molecular generation tasks, they typically consider only a single stage of the generation process. As a result, they cannot integrate multi-stage protein-ligand interaction information from both the forward and reverse processes, which may reduce the binding affinity of the generated molecules. To address this problem, the authors propose MSIDiff (Multi-Stage Interaction-Aware Diffusion Model), a multi-stage interaction-aware diffusion model for protein-specific molecule generation. MSIDiff uses the pre-trained model MSINet to extract authentic protein-ligand interaction information during the initial diffusion stage and incorporates this information into the reverse process so that the generated molecules interact accurately with target proteins. Through a scoring mechanism, MSIDiff filters key nodes to extract crucial protein-ligand interaction data, and it employs a GRU-based cross-layer interaction update module to recursively integrate information across different denoising stages, enabling effective cross-layer information transmission. Experimental results on the CrossDocked2020 dataset show that MSIDiff can generate molecules with more realistic 3D structures and higher binding affinity to protein targets, achieving an Avg. Vina Score of up to -6.36 while maintaining appropriate molecular properties. Code and data are available at: https://github.com/zhangyaoxiang/MSIDiff.
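The abstract names two mechanisms, score-based key-node filtering and a GRU-based recursive update across denoising stages, without giving their details. The sketch below is only an illustration of that general pattern and is not the authors' implementation. Every name (select_key_nodes, mean_pool, gru_update), the linear-plus-sigmoid scorer, the mean pooling, the random stand-in features, and all dimensions are assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def select_key_nodes(features, score_weights, k):
    """Score each node with a linear layer plus sigmoid; return top-k indices and all scores.

    Hypothetical scorer: the abstract does not say how nodes are scored.
    """
    scores = [sigmoid(sum(w * f for w, f in zip(score_weights, feat)))
              for feat in features]
    order = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    return order[:k], scores


def mean_pool(features, indices):
    """Average the feature vectors of the selected nodes (assumed aggregation)."""
    dim = len(features[0])
    return [sum(features[i][d] for i in indices) / len(indices) for d in range(dim)]


def gru_update(x, h, p):
    """One standard GRU cell step (biases omitted): new state from input x and state h."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b)
         for a, b in zip(matvec(p["Wn"], x), matvec(p["Un"], gated_h))]
    # Update gate z interpolates between the previous state and the candidate n.
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim, num_nodes, k, num_stages = 4, 6, 3, 5  # illustrative sizes only
params = {name: random_matrix(rng, dim, dim)
          for name in ("Wz", "Uz", "Wr", "Ur", "Wn", "Un")}
score_weights = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

state = [0.0] * dim  # interaction memory carried across denoising stages
for stage in range(num_stages):
    # Random stand-in for the per-stage protein-ligand node embeddings.
    node_features = random_matrix(rng, num_nodes, dim)
    key_idx, _ = select_key_nodes(node_features, score_weights, k)
    summary = mean_pool(node_features, key_idx)
    state = gru_update(summary, state, params)

print(state)
```

In this toy loop the GRU state acts as a memory that carries a summary of the selected nodes from one stage to the next. In a real model the weights would be learned and the features would come from the protein-ligand graph rather than a random generator.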
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.