建立新冠肺炎药物再利用的SARS-CoV-2主要蛋白酶结合预测随机森林模型。

IF 4.6 Q2 MATERIALS SCIENCE, BIOMATERIALS
ACS Applied Bio Materials Pub Date : 2023-11-01 Epub Date: 2023-11-24 DOI:10.1177/15353702231209413
Jie Liu, Liang Xu, Wenjing Guo, Zoe Li, Md Kamrul Hasan Khan, Weigong Ge, Tucker A Patterson, Huixiao Hong
{"title":"建立新冠肺炎药物再利用的SARS-CoV-2主要蛋白酶结合预测随机森林模型。","authors":"Jie Liu, Liang Xu, Wenjing Guo, Zoe Li, Md Kamrul Hasan Khan, Weigong Ge, Tucker A Patterson, Huixiao Hong","doi":"10.1177/15353702231209413","DOIUrl":null,"url":null,"abstract":"<p><p>The coronavirus disease 2019 (COVID-19) global pandemic resulted in millions of people becoming infected with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus and close to seven million deaths worldwide. It is essential to further explore and design effective COVID-19 treatment drugs that target the main protease of SARS-CoV-2, a major target for COVID-19 drugs. In this study, machine learning was applied for predicting the SARS-CoV-2 main protease binding of Food and Drug Administration (FDA)-approved drugs to assist in the identification of potential repurposing candidates for COVID-19 treatment. Ligands bound to the SARS-CoV-2 main protease in the Protein Data Bank and compounds experimentally tested in SARS-CoV-2 main protease binding assays in the literature were curated. These chemicals were divided into training (516 chemicals) and testing (360 chemicals) data sets. To identify SARS-CoV-2 main protease binders as potential candidates for repurposing to treat COVID-19, 1188 FDA-approved drugs from the Liver Toxicity Knowledge Base were obtained. A random forest algorithm was used for constructing predictive models based on molecular descriptors calculated using Mold2 software. Model performance was evaluated using 100 iterations of fivefold cross-validations which resulted in 78.8% balanced accuracy. The random forest model that was constructed from the whole training dataset was used to predict SARS-CoV-2 main protease binding on the testing set and the FDA-approved drugs. Model applicability domain and prediction confidence on drugs predicted as the main protease binders discovered 10 FDA-approved drugs as potential candidates for repurposing to treat COVID-19. Our results demonstrate that machine learning is an efficient method for drug repurposing and, thus, may accelerate drug development targeting SARS-CoV-2.</p>","PeriodicalId":2,"journal":{"name":"ACS Applied Bio Materials","volume":null,"pages":null},"PeriodicalIF":4.6000,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10798185/pdf/","citationCount":"0","resultStr":"{\"title\":\"Developing a SARS-CoV-2 main protease binding prediction random forest model for drug repurposing for COVID-19 treatment.\",\"authors\":\"Jie Liu, Liang Xu, Wenjing Guo, Zoe Li, Md Kamrul Hasan Khan, Weigong Ge, Tucker A Patterson, Huixiao Hong\",\"doi\":\"10.1177/15353702231209413\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The coronavirus disease 2019 (COVID-19) global pandemic resulted in millions of people becoming infected with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus and close to seven million deaths worldwide. It is essential to further explore and design effective COVID-19 treatment drugs that target the main protease of SARS-CoV-2, a major target for COVID-19 drugs. In this study, machine learning was applied for predicting the SARS-CoV-2 main protease binding of Food and Drug Administration (FDA)-approved drugs to assist in the identification of potential repurposing candidates for COVID-19 treatment. Ligands bound to the SARS-CoV-2 main protease in the Protein Data Bank and compounds experimentally tested in SARS-CoV-2 main protease binding assays in the literature were curated. These chemicals were divided into training (516 chemicals) and testing (360 chemicals) data sets. To identify SARS-CoV-2 main protease binders as potential candidates for repurposing to treat COVID-19, 1188 FDA-approved drugs from the Liver Toxicity Knowledge Base were obtained. A random forest algorithm was used for constructing predictive models based on molecular descriptors calculated using Mold2 software. Model performance was evaluated using 100 iterations of fivefold cross-validations which resulted in 78.8% balanced accuracy. The random forest model that was constructed from the whole training dataset was used to predict SARS-CoV-2 main protease binding on the testing set and the FDA-approved drugs. Model applicability domain and prediction confidence on drugs predicted as the main protease binders discovered 10 FDA-approved drugs as potential candidates for repurposing to treat COVID-19. Our results demonstrate that machine learning is an efficient method for drug repurposing and, thus, may accelerate drug development targeting SARS-CoV-2.</p>\",\"PeriodicalId\":2,\"journal\":{\"name\":\"ACS Applied Bio Materials\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2023-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10798185/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACS Applied Bio Materials\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1177/15353702231209413\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2023/11/24 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"MATERIALS SCIENCE, BIOMATERIALS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Bio Materials","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1177/15353702231209413","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/11/24 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"MATERIALS SCIENCE, BIOMATERIALS","Score":null,"Total":0}
引用次数: 0

摘要

2019年冠状病毒病(COVID-19)全球大流行导致数百万人感染了严重急性呼吸综合征冠状病毒2 (SARS-CoV-2)病毒,全球近700万人死亡。因此,有必要进一步探索和设计针对SARS-CoV-2主要蛋白酶的有效COVID-19治疗药物,这是COVID-19药物的主要靶点。在这项研究中,机器学习被应用于预测美国食品和药物管理局(FDA)批准的药物与SARS-CoV-2主要蛋白酶的结合,以帮助确定潜在的可用于治疗COVID-19的候选药物。筛选蛋白质数据库中与SARS-CoV-2主要蛋白酶结合的配体,以及文献中SARS-CoV-2主要蛋白酶结合试验中实验检测的化合物。这些化学品被分为训练(516种化学品)和测试(360种化学品)数据集。为了确定SARS-CoV-2主要蛋白酶结合物作为重新用于治疗COVID-19的潜在候选物,从肝毒性知识库中获得了1188种fda批准的药物。基于Mold2软件计算的分子描述符,采用随机森林算法构建预测模型。使用100次五重交叉验证来评估模型性能,结果达到78.8%的平衡精度。利用整个训练数据集构建的随机森林模型预测了SARS-CoV-2主要蛋白酶在测试集和fda批准的药物上的结合。预测作为主要蛋白酶结合物的药物的模型适用性域和预测置信度发现了10种fda批准的药物作为治疗COVID-19的潜在候选药物。我们的研究结果表明,机器学习是一种有效的药物再利用方法,因此可能会加速针对SARS-CoV-2的药物开发。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Developing a SARS-CoV-2 main protease binding prediction random forest model for drug repurposing for COVID-19 treatment.

The coronavirus disease 2019 (COVID-19) global pandemic resulted in millions of people becoming infected with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus and close to seven million deaths worldwide. It is essential to further explore and design effective COVID-19 treatment drugs that target the main protease of SARS-CoV-2, a major target for COVID-19 drugs. In this study, machine learning was applied for predicting the SARS-CoV-2 main protease binding of Food and Drug Administration (FDA)-approved drugs to assist in the identification of potential repurposing candidates for COVID-19 treatment. Ligands bound to the SARS-CoV-2 main protease in the Protein Data Bank and compounds experimentally tested in SARS-CoV-2 main protease binding assays in the literature were curated. These chemicals were divided into training (516 chemicals) and testing (360 chemicals) data sets. To identify SARS-CoV-2 main protease binders as potential candidates for repurposing to treat COVID-19, 1188 FDA-approved drugs from the Liver Toxicity Knowledge Base were obtained. A random forest algorithm was used for constructing predictive models based on molecular descriptors calculated using Mold2 software. Model performance was evaluated using 100 iterations of fivefold cross-validations which resulted in 78.8% balanced accuracy. The random forest model that was constructed from the whole training dataset was used to predict SARS-CoV-2 main protease binding on the testing set and the FDA-approved drugs. Model applicability domain and prediction confidence on drugs predicted as the main protease binders discovered 10 FDA-approved drugs as potential candidates for repurposing to treat COVID-19. Our results demonstrate that machine learning is an efficient method for drug repurposing and, thus, may accelerate drug development targeting SARS-CoV-2.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
ACS Applied Bio Materials
ACS Applied Bio Materials Chemistry-Chemistry (all)
CiteScore
9.40
自引率
2.10%
发文量
464
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信