Benchmarking 3D Structure-Based Molecule Generators

IF 5.3; Zone 2 (Chemistry); JCR Q1 (CHEMISTRY, MEDICINAL)

Journal of Chemical Information and Modeling. Pub Date: 2025-07-25. DOI: 10.1021/acs.jcim.5c01020

Natasha Sanjrani, Damien E Coupry, Peter Pogány, David S Palmer, Stephen D Pickett


Citations: 0

Abstract



To understand the benefits and drawbacks of 3D combinatorial and deep learning generators, a novel benchmark was created focusing on the recreation of important protein-ligand interactions and 3D ligand conformations. Using the BindingMOAD data set with a hold-out blind set, the sequential graph neural network generators, Pocket2Mol and PocketFlow, diffusion models, DiffSBDD and MolSnapper, and combinatorial genetic algorithms, AutoGrow4 and LigBuilderV3, were evaluated. It was discovered that deep learning methods fail to generate structurally valid molecules and 3D conformations, whereas combinatorial methods are slow and generate molecules that are prone to failing 2D MOSES filters. The results from this evaluation guide us toward improving deep learning structure-based generators by placing higher importance on structural validity, 3D ligand conformations, and recreation of important known active site interactions. This benchmark should be used to understand the limitations of future combinatorial and deep learning generators. The package is freely available under an Apache 2.0 license at github.com/gskcheminformatics/SBDD-benchmarking.
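As an illustration of what "recreation of important protein-ligand interactions" could mean as a metric, the sketch below computes the fraction of reference interactions that a generated ligand reproduces. This is a minimal sketch and not the method of the SBDD-benchmarking package: the representation of an interaction as a `(residue, interaction type)` pair, the function names `interaction_recovery` and `mean_recovery`, and the example residues are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: one interaction is (residue_id, interaction_type),
# e.g. ("ASP189", "hbond"). The benchmark itself may encode interactions differently.
Interaction = Tuple[str, str]


def interaction_recovery(
    reference: Iterable[Interaction],
    generated: Iterable[Interaction],
) -> float:
    """Fraction of reference interactions also present in the generated ligand."""
    ref = set(reference)
    if not ref:
        raise ValueError("reference ligand has no interactions to recover")
    # Extra interactions made only by the generated ligand are ignored here.
    return len(ref & set(generated)) / len(ref)


def mean_recovery(
    reference: Iterable[Interaction],
    generated_sets: List[Iterable[Interaction]],
) -> float:
    """Average recovery over all ligands generated for one binding pocket."""
    if not generated_sets:
        raise ValueError("no generated ligands supplied")
    ref = list(reference)
    scores = [interaction_recovery(ref, g) for g in generated_sets]
    return sum(scores) / len(scores)


# Illustrative (made-up) data: four reference interactions, one generated ligand.
reference_example = [
    ("ASP189", "hbond"),
    ("GLY216", "hbond"),
    ("SER190", "hbond"),
    ("TYR228", "pi_stacking"),
]
generated_example = [
    ("ASP189", "hbond"),
    ("GLY216", "hbond"),
    ("TRP215", "hydrophobic"),
]

score = interaction_recovery(reference_example, generated_example)
```

In this made-up example the generated ligand reproduces two of the four reference interactions, so `score` is 0.5. A set-overlap score of this kind rewards recovering known active-site contacts without penalizing new ones; whether the published benchmark weights or filters interactions differently is not stated in the abstract.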


Source journal

Journal of Chemical Information and Modeling (Chemistry: Chemistry, Multidisciplinary)

CiteScore: 9.80

Self-citation rate: 10.70%

Articles published: 529

Review time: 1.4 months

About the journal: The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery. Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field. As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.