Xun Deng, Junlong Liu, Zhike Liu, Jiansheng Wu, Fuli Feng, Jieping Ye, Zheng Wang
Journal of Chemical Theory and Computation. Published 2025-04-16. DOI: 10.1021/acs.jctc.4c01618 (https://doi.org/10.1021/acs.jctc.4c01618)
Improving the Hit Rates of Virtual Screening by Active Learning from Bioactivity Feedback.
Virtual screening is widely used in drug discovery to identify potential hit candidates that can bind to a target protein. Contemporary screening methods typically rely on oversimplified scoring functions and frequently yield single-digit hit rates (or even zero) among top-ranked candidates. The substantial cost of laboratory validation further constrains the exploration of candidate molecules. We find that test-time prediction refinement is largely unexplored in this area, which means that bioactivity feedback from wet-lab experiments is neglected. Here, we introduce an Active Learning from Bioactivity Feedback (ALBF) framework to improve the low hit rates of current virtual screening methods. ALBF spends the wet-lab experiment budget iteratively and uses the target-specific bioactivity insights from the tests run so far to refine the scores (i.e., the rankings). The framework consists of two components: a novel query strategy that considers the evaluation quality of a candidate and its overall influence on other top-scored molecules, and an efficient score optimization strategy that propagates the bioactivity feedback to structurally similar molecules. We evaluated ALBF on diverse subsets of the well-known DUD-E and LIT-PCBA benchmarks. On average, our active learning protocol improves top-100 hit rates by 60% on DUD-E and 30% on LIT-PCBA, using 50 to 200 bioactivity queries on selected molecules spread over ten rounds. The consistently superior performance demonstrates ALBF's potential to improve both the accuracy and the cost-effectiveness of active learning-based laboratory testing.
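The abstract describes the feedback loop only at a high level, so the following Python sketch illustrates the general idea rather than the authors' algorithm. The greedy "test the highest-scored untested molecules" query rule, the Tanimoto similarity on bit-set fingerprints, the additive score update, the `step` parameter, the pinning of tested molecules to the top or bottom of the ranking, and all function names are assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def top_k_hit_rate(scores: Dict[str, float], actives: Set[str], k: int) -> float:
    """Fraction of the k highest-scored molecules that are truly active."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(m in actives for m in ranked) / k


def feedback_loop(
    scores: Dict[str, float],
    fingerprints: Dict[str, Set[int]],
    assay: Callable[[str], bool],
    rounds: int,
    per_round: int,
    step: float = 1.0,
) -> Tuple[Dict[str, float], Dict[str, bool]]:
    """Spend the assay budget over several rounds and re-rank after each query.

    Assumed (hypothetical) rules, not taken from the paper:
    - query: pick the highest-scored molecules that are still untested;
    - update: shift each untested molecule's score by its similarity to the
      queried molecule, upward for an active result, downward for an inactive;
    - tested molecules are pinned to the top (active) or bottom (inactive).
    """
    scores = dict(scores)  # do not mutate the caller's scores
    labels: Dict[str, bool] = {}
    for _ in range(rounds):
        untested: List[str] = [m for m in scores if m not in labels]
        batch = sorted(untested, key=scores.get, reverse=True)[:per_round]
        for q in batch:
            labels[q] = assay(q)  # one simulated wet-lab measurement
            signal = 1.0 if labels[q] else -1.0
            scores[q] = math.inf if labels[q] else -math.inf
            for m in scores:
                if m not in labels:
                    sim = tanimoto(fingerprints[q], fingerprints[m])
                    scores[m] += step * signal * sim
    return scores, labels


# Toy data: three actives sharing bits {1, 2}, three inactives sharing bit 7.
toy_fingerprints = {
    "a": {1, 2, 3}, "b": {1, 2, 4}, "e": {1, 2, 5},
    "c": {7, 8, 9}, "d": {7, 8, 10}, "f": {7, 9, 11},
}
toy_actives = {"a", "b", "e"}
toy_scores = {"c": 0.9, "a": 0.8, "d": 0.7, "f": 0.6, "b": 0.3, "e": 0.2}

refined, tested = feedback_loop(
    toy_scores, toy_fingerprints, lambda m: m in toy_actives,
    rounds=1, per_round=2,
)
print(top_k_hit_rate(toy_scores, toy_actives, 3))
print(top_k_hit_rate(refined, toy_actives, 3))
```

On this toy data, a single round of two queries raises the top-3 hit rate from 1/3 to 3/3: the inactive result for `c` pushes down its structural neighbours `d` and `f`, and the active result for `a` pulls up `b` and `e`. Real fingerprints, similarity thresholds, and query criteria would need to be chosen for the screening setup at hand.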
About the journal:
The Journal of Chemical Theory and Computation invites new and original contributions with the understanding that, if accepted, they will not be published elsewhere. Papers reporting new theories, methodology, and/or important applications in quantum electronic structure, molecular dynamics, and statistical mechanics are appropriate for submission to this Journal. Specific topics include advances in or applications of ab initio quantum mechanics, density functional theory, design and properties of new materials, surface science, Monte Carlo simulations, solvation models, QM/MM calculations, biomolecular structure prediction, and molecular dynamics in the broadest sense including gas-phase dynamics, ab initio dynamics, biomolecular dynamics, and protein folding. The Journal does not consider papers that are straightforward applications of known methods including DFT and molecular dynamics. The Journal favors submissions that include advances in theory or methodology with applications to compelling problems.