Improving the Hit Rates of Virtual Screening by Active Learning from Bioactivity Feedback

IF 5.7, Zone 1 (Chemistry), JCR Q2 (CHEMISTRY, PHYSICAL)

Journal of Chemical Theory and Computation, Pub Date: 2025-04-16, DOI: 10.1021/acs.jctc.4c01618

Xun Deng, Junlong Liu, Zhike Liu, Jiansheng Wu, Fuli Feng, Jieping Ye, Zheng Wang


Citations: 0

Abstract


Improving the Hit Rates of Virtual Screening by Active Learning from Bioactivity Feedback.

Virtual screening is widely used in drug discovery to identify potential hit candidates that can bind to a target protein. Contemporary screening methods typically rely on oversimplified scoring functions and frequently yield single-digit hit rates (or even zero) among top-ranked candidates. The substantial cost of laboratory validation further constrains the exploration of candidate molecules. We find that test-time prediction refinement is largely unexplored in this area, which means that bioactivity feedback from wet-lab experiments is neglected. Here, we introduce an Active Learning from Bioactivity Feedback (ALBF) framework to improve the low hit rates of current virtual screening methods. ALBF spends the wet-lab budget iteratively and leverages target-specific bioactivity insights from the current wet-lab tests to refine the scoring results (i.e., the rankings). Our framework consists of two components: a novel query strategy that considers the evaluation quality and its overall influence on other top-scored molecules, and an efficient score optimization strategy that propagates the bioactivity feedback to structurally similar molecules. We evaluated ALBF on diverse subsets of the well-known DUD-E and LIT-PCBA benchmarks. On average, our active learning protocol improves top-100 hit rates by 60% and 30% on DUD-E and LIT-PCBA, respectively, using 50 to 200 bioactivity queries on the selected molecules spread over ten rounds. The consistently superior performance demonstrates ALBF's potential to enhance both the accuracy and cost-effectiveness of active learning-based laboratory testing.
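To make the two components in the abstract concrete, the following is a minimal Python sketch of an iterative query-and-propagate loop on toy data. It is not the ALBF implementation: the function names, the summed-similarity "influence" proxy, the additive score update, and the `alpha` step size are all assumptions made for illustration, and the abstract does not specify the actual formulas. In practice the similarity matrix might come from, for example, Tanimoto similarity over molecular fingerprints; here it is a hand-built block matrix.

```python
import numpy as np

def top_k_hit_rate(scores, is_active, k):
    """Fraction of true actives among the k highest-scored molecules."""
    top = np.argsort(-scores)[:k]
    return float(is_active[top].mean())

def select_queries(scores, similarity, tested, batch_size, top_n):
    """Pick untested top-ranked molecules most similar to the current top list."""
    top = np.argsort(-scores)[:top_n]
    candidates = top[~tested[top]]
    # Hypothetical "influence" proxy: summed similarity to the top-ranked set.
    influence = similarity[np.ix_(candidates, top)].sum(axis=1)
    return candidates[np.argsort(-influence)[:batch_size]]

def refine_scores(scores, similarity, oracle, rounds, batch_size, top_n, alpha=1.0):
    """Spend the assay budget over several rounds and propagate each result."""
    scores = np.array(scores, dtype=float)  # work on a copy
    tested = np.zeros(len(scores), dtype=bool)
    for _ in range(rounds):
        for i in select_queries(scores, similarity, tested, batch_size, top_n):
            label = oracle(i)  # simulated wet-lab result: 1 = active, 0 = inactive
            tested[i] = True
            # Assumed update rule: push similar molecules up after an active
            # result and down after an inactive one, scaled by similarity.
            scores += alpha * (2 * label - 1) * similarity[i]
    return scores, tested

# Toy data: molecules 0-2 are active, 3-5 are inactive decoys.
cluster = np.array([0, 0, 0, 1, 1, 1])
is_active = cluster == 0
similarity = np.where(cluster[:, None] == cluster[None, :], 0.8, 0.0)
np.fill_diagonal(similarity, 1.0)

# Initial scores that wrongly rank two decoys near the top.
initial_scores = np.array([0.1, 0.2, 0.9, 0.8, 0.7, 0.3])

refined_scores, tested = refine_scores(
    initial_scores, similarity,
    oracle=lambda i: int(is_active[i]),
    rounds=1, batch_size=2, top_n=3,
)

print("top-3 hit rate before:", top_k_hit_rate(initial_scores, is_active, 3))
print("top-3 hit rate after: ", top_k_hit_rate(refined_scores, is_active, 3))
```

In this toy run the two queried decoys come back inactive, and the negative feedback also lowers the score of the third, structurally similar decoy, so the untested actives move to the top of the ranking. A real setting would need a tuned step size and a query rule that also weighs how informative each assay is, neither of which is modelled here.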


Source journal

Journal of Chemical Theory and Computation (Chemistry; Physics: Atomic, Molecular and Chemical)

CiteScore: 9.90
Self-citation rate: 16.40%
Articles published: 568
Review time: 1 month

Journal description: The Journal of Chemical Theory and Computation invites new and original contributions with the understanding that, if accepted, they will not be published elsewhere. Papers reporting new theories, methodology, and/or important applications in quantum electronic structure, molecular dynamics, and statistical mechanics are appropriate for submission to this Journal. Specific topics include advances in or applications of ab initio quantum mechanics, density functional theory, design and properties of new materials, surface science, Monte Carlo simulations, solvation models, QM/MM calculations, biomolecular structure prediction, and molecular dynamics in the broadest sense including gas-phase dynamics, ab initio dynamics, biomolecular dynamics, and protein folding. The Journal does not consider papers that are straightforward applications of known methods including DFT and molecular dynamics. The Journal favors submissions that include advances in theory or methodology with applications to compelling problems.