Building a ML-based QSAR model for predicting the bioactivity of therapeutically active drug class with imidazole scaffold

Komal Singh, Irina Ghosh, Venkatesan Jayaprakash, Sudeepan Jayapalan
European Journal of Medicinal Chemistry Reports, volume 11, Article 100148. Published 2024-03-23. DOI: 10.1016/j.ejmcr.2024.100148

Abstract

Human immunodeficiency virus (HIV), a retrovirus, causes AIDS, a chronic disease of the immune system. HIV weakens the immune system and so impairs the body's ability to fight disease and infection. Reverse transcriptase (RT) is an enzyme essential for HIV replication. RT inhibitors (RTIs) are a class of antiretroviral drugs that target HIV's RT enzyme, blocking its ability to convert viral RNA into DNA. Imidazole has been found to inhibit the RT-1 enzyme: it binds to the enzyme's active site and prevents it from performing its usual function. As a result, viral replication is inhibited, which can help slow the course of HIV and other retroviral diseases. Computational tools allow researchers to simulate and analyze a drug's behaviour in a virtual environment, providing insight into its pharmacological properties, efficacy, and safety. QSAR modelling uses machine learning methods to build predictive models from datasets of chemical compounds and their associated biological activity. Here, a comparative analysis of the performance of four algorithms on the imidazole scaffold is reported. Support Vector Regression (SVR), Random Forest Regression (RFR), Decision Tree Regression (DTR) and Hist Gradient Boosting Regression (HGBR) gave promising results, with R² values of 0.905, 0.993, 0.688 and 0.921, respectively, on the training set and 0.843, 0.977, 0.567 and 0.880 on the test set. The best-performing model, RFR, was validated on randomly selected compounds using the developed RFR code and showed an error of only about 0.151%. The R² values indicate that the RFR and HGBR models fit the data better than the other models, making them candidate models for predicting the activity of novel antiviral compounds.
