Benchmarking AI-powered docking methods from the perspective of virtual screening

IF 23.9 · CAS Zone 1, Computer Science · JCR Q1, Computer Science, Artificial Intelligence

Nature Machine Intelligence Pub Date : 2025-02-13 DOI:10.1038/s42256-025-00993-0

Shukai Gu, Chao Shen, Xujun Zhang, Huiyong Sun, Heng Cai, Hao Luo, Huifeng Zhao, Bo Liu, Hongyan Du, Yihao Zhao, Chenggong Fu, Silong Zhai, Yafeng Deng, Huanxiang Liu, Tingjun Hou, Yu Kang

Nature Machine Intelligence, volume 7, issue 3, pages 509-520. Publication type: Journal Article. Open access: no. Source: https://www.nature.com/articles/s42256-025-00993-0

Citations: 0

Abstract



Recently, many artificial intelligence (AI)-powered protein–ligand docking and scoring methods have been developed, demonstrating impressive speed and accuracy. However, these methods often neglected the physical plausibility of the docked complexes and their efficacy in virtual screening (VS) projects. Therefore, we conducted a comprehensive benchmark analysis of four AI-powered and four physics-based docking tools and two AI-enhanced rescoring methods. We initially constructed the TrueDecoy set, a dataset on which the redocking experiments revealed that KarmaDock and CarsiDock surpassed all physics-based tools in docking accuracy, whereas all physics-based tools notably outperformed AI-based methods in structural rationality. The low physical plausibility of docked structures generated by the top AI method, CarsiDock, mainly stems from insufficient intermolecular validity. The VS results on the TrueDecoy set highlight the effectiveness of RTMScore as a rescore function, and Glide-based methods achieved the highest enrichment factors among all docking tools. Furthermore, we created the RandomDecoy set, a dataset that more closely resembles real-world VS scenarios, where AI-based tools obviously outperformed Glide. Additionally, we found that the employed ligand-based postprocessing methods had a weak or even negative impact on optimizing the conformations of docked complexes and enhancing VS performance. Finally, we proposed a hierarchical VS strategy that could efficiently and accurately enrich active molecules in large-scale VS projects.

Artificial intelligence (AI)-based docking and scoring methods demonstrate considerable potential for virtual drug screening. Gu et al. go further by assessing the structural rationality of AI-predicted complex conformations from various sources.
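The abstract compares tools by enrichment factor without defining it. As a rough illustration only, the sketch below uses the commonly cited definition: the hit rate among the top-ranked fraction of the library divided by the hit rate of the whole library. The function name `enrichment_factor`, the `fraction` parameter, and the convention that a higher score means a better rank are assumptions made for this example; the cutoffs and exact formula used by Gu et al. are not stated in the abstract.

```python
from typing import Sequence


def enrichment_factor(
    scores: Sequence[float],
    is_active: Sequence[bool],
    fraction: float = 0.01,
) -> float:
    """Enrichment factor at the top `fraction` of a ranked library.

    Assumes a higher score means a better rank. For scores where lower
    is better, negate them before calling this function.
    """
    if len(scores) != len(is_active) or len(scores) == 0:
        raise ValueError("scores and is_active must be non-empty and equal in length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    n_total = len(scores)
    n_actives = sum(1 for flag in is_active if flag)
    if n_actives == 0:
        raise ValueError("at least one active compound is required")

    # Number of compounds in the selected top slice (at least one).
    n_top = max(1, round(n_total * fraction))

    # Rank compounds from best to worst score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(1 for _, flag in ranked[:n_top] if flag)

    # Hit rate in the top slice relative to the hit rate of the full library.
    return (hits_top / n_top) / (n_actives / n_total)


if __name__ == "__main__":
    # Toy library: 10 compounds, 2 actives, both ranked at the top.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    toy_labels = [True, True] + [False] * 8
    print(enrichment_factor(toy_scores, toy_labels, fraction=0.2))
```

A value of 1 corresponds to random selection, and the maximum attainable value depends on the ratio of actives to decoys, which is one reason results on differently constructed decoy sets are not directly comparable.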


Source journal

Nature Machine Intelligence Multiple-

CiteScore: 36.90

Self-citation rate: 2.10%

Articles published: 127

About the journal: Nature Machine Intelligence is a distinguished publication that presents original research and reviews on various topics in machine learning, robotics, and AI. Our focus extends beyond these fields, exploring their profound impact on other scientific disciplines, as well as societal and industrial aspects. We recognize limitless possibilities wherein machine intelligence can augment human capabilities and knowledge in domains like scientific exploration, healthcare, medical diagnostics, and the creation of safe and sustainable cities, transportation, and agriculture. Simultaneously, we acknowledge the emergence of ethical, social, and legal concerns due to the rapid pace of advancements. To foster interdisciplinary discussions on these far-reaching implications, Nature Machine Intelligence serves as a platform for dialogue facilitated through Comments, News Features, News & Views articles, and Correspondence. Our goal is to encourage a comprehensive examination of these subjects. Similar to all Nature-branded journals, Nature Machine Intelligence operates under the guidance of a team of skilled editors. We adhere to a fair and rigorous peer-review process, ensuring high standards of copy-editing and production, swift publication, and editorial independence.