Development and experimental validation of a machine learning model for the prediction of new antimalarials

Impact factor 4.3 | Zone 2, Chemistry | JCR Q2, CHEMISTRY, MULTIDISCIPLINARY
Mukul Kore, Dimple Acharya, Lakshya Sharma, Shruthi Sridhar Vembar, Sandeep Sundriyal
Journal: BMC Chemistry, volume 19 (1). Published 2025-01-30. Publication type: Journal Article.
DOI: 10.1186/s13065-025-01395-4
Article: https://link.springer.com/article/10.1186/s13065-025-01395-4
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783816/pdf/
Citations: 0

Abstract


A large set of antimalarial molecules (N ~ 15k) was retrieved from ChEMBL to build a robust random forest (RF) model for predicting antiplasmodial activity. Rather than relying on high-throughput screening (HTS) data, model development used molecules tested at multiple doses against the blood stages of Plasmodium falciparum. A workflow was developed on the open-access, code-free KNIME platform to train the model on 80% of the data (N ~ 12k). Hyperparameter values were optimized to achieve the highest predictive accuracy with nine different molecular fingerprints (MFPs), among which the Avalon MFP gave the best results; the resulting model is referred to as RF-1. On the remaining 20% test set, RF-1 achieved 91.7% accuracy, 93.5% precision, 88.4% sensitivity, and an area under the receiver operating characteristic curve (AUROC) of 97.3%. The predictive performance of RF-1 was comparable to that of the malaria inhibitor prediction platform (MAIP), a recently reported consensus model based on a large proprietary dataset. However, the hits that RF-1 and MAIP each retrieved from a commercial library did not overlap, suggesting that the two models are complementary. Finally, RF-1 was used to screen small molecules under clinical investigation for repurposing. Six molecules were purchased, of which two human kinase inhibitors showed single-digit micromolar antiplasmodial activity. One of the hits (compound 1) was a potent inhibitor of β-hematin formation, suggesting that parasite hemozoin (Hz) synthesis is involved in the parasiticidal effect. The training and test sets are provided as supplementary information, allowing others to reproduce this work.
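The four test-set figures quoted in the abstract (accuracy, precision, sensitivity, AUROC) follow from standard definitions. The sketch below shows how they could be computed for any binary active/inactive classifier. It is not the authors' KNIME workflow: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the paper.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = active and 0 = inactive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (active, inactive) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one active and one inactive example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> dict[str, float]:
    """Compute the four metrics; `threshold` (an assumed default) turns scores into labels."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "auroc": auroc(y_true, scores),
    }


# Toy data, invented for illustration only (not from the paper's test set).
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
report = evaluate(toy_labels, toy_scores)
```

Accuracy, precision, and sensitivity depend on the chosen threshold, whereas AUROC is computed from the ranking of scores alone. This is why a model can report a high AUROC alongside a lower sensitivity, as RF-1 does.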

Source journal: BMC Chemistry
Subject area: Chemistry, General Chemistry
CiteScore: 5.30
Self-citation rate: 2.20%
Articles published: 92
Review time: 27 weeks
About the journal: BMC Chemistry, formerly known as Chemistry Central Journal, is now part of the BMC series family of journals. Chemistry Central Journal served the chemistry community as a trusted open access resource for more than 10 years. In January 2019 the journal was renamed BMC Chemistry, and it now strengthens the BMC series footprint in the physical sciences by publishing quality articles and pushing the boundaries of open chemistry.