Ligand Based Virtual Screening of Molecular Compounds in Drug Discovery Using GCAN Fingerprint and Ensemble Machine Learning Algorithm

IF 2.2 | Zone 4, Computer Science | Q2 Computer Science
R. Ani, O. S. Deepa, B. R. Manju
Journal: Computer Systems Science and Engineering
Publication type: Journal Article
Published: 2023-01-01
DOI: 10.32604/csse.2023.033807 (https://doi.org/10.32604/csse.2023.033807)
Citations: 0

Abstract

Drug development is slow because it requires sifting a large collection of candidate compounds, most of them inactive, to find the few that can bind to a disease protein. Virtual screening is increasingly used in pharmaceutical research and is crucial in the early phases of drug research and development, where it narrows the search over chemical compounds. Because compound databases contain more and more ligands, the screening method needs to be both fast and accurate. Neural network fingerprints have proved more effective than the well-known Extended Connectivity Fingerprint (ECFP). However, although a conventional graph network generates a better-encoded fingerprint, it considers only the largest sub-graph when learning the representation, and its average or maximum pooling layer also carries unrelated information into the fingerprint. To address these problems, this article proposes the Graph Convolutional Attention Network (GCAN), a graph neural network with an attention mechanism. The attention mechanism gives greater weight to the nodes or sub-graphs that are used to create the molecular fingerprint. The generated fingerprint is then used to classify drugs with ensemble learning: ensemble stacking is applied with Support Vector Machines (SVM), Random Forest, Naive Bayes, Decision Trees, AdaBoost, and Gradient Boosting as base classifiers. Compared with existing models, the proposed GCAN fingerprint with an ensemble model achieves relatively high accuracy, sensitivity, specificity, and area under the curve. In particular, ensemble learning on the generated molecular fingerprint yields 91% accuracy, outperforming earlier approaches.
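The contrast the abstract draws between average pooling and an attention-based readout can be illustrated with a minimal sketch. This is not the authors' GCAN architecture, which the abstract does not specify; it assumes a generic dot-product attention readout, and the node features, the `query` vector, and all function names are hypothetical values chosen for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_readout(node_feats):
    """Average pooling: every node contributes equally to the fingerprint."""
    n = len(node_feats)
    dim = len(node_feats[0])
    return [sum(h[d] for h in node_feats) / n for d in range(dim)]


def attention_readout(node_feats, query):
    """Attention pooling: weight each node by softmax(h . query).

    `query` stands in for a learned attention vector (an assumption here).
    Returns the pooled fingerprint and the per-node attention weights.
    """
    scores = [sum(x * q for x, q in zip(h, query)) for h in node_feats]
    weights = softmax(scores)
    dim = len(node_feats[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, node_feats)) for d in range(dim)
    ]
    return pooled, weights


# Toy "molecule": 4 atoms with 3-dimensional hypothetical hidden features.
node_feats = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],  # the node the toy query scores highest
    [0.0, 1.0, 0.0],
]
query = [0.0, 0.0, 4.0]  # hypothetical attention vector, not a trained one

mean_fp = mean_readout(node_feats)
attn_fp, attn_weights = attention_readout(node_feats, query)

print("mean fingerprint:     ", [round(v, 3) for v in mean_fp])
print("attention fingerprint:", [round(v, 3) for v in attn_fp])
print("attention weights:    ", [round(w, 3) for w in attn_weights])
```

In this toy setting the mean readout dilutes the third node's signal across all four atoms, while the attention readout concentrates the fingerprint on it. In a trained model the attention scores would be learned jointly with the graph convolution layers rather than fixed by hand.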
Source journal
Computer Systems Science and Engineering (Engineering Technology, Computer Science: Theory & Methods)
CiteScore: 3.10
Self-citation rate: 13.60%
Articles published: 308
Review time: >12 weeks
About the journal: The journal is devoted to the publication of high quality papers on theoretical developments in computer systems science, and their applications in computer systems engineering. Original research papers, state-of-the-art reviews and technical notes are invited for publication. All papers will be refereed by acknowledged experts in the field, and may be (i) accepted without change, (ii) returned for amendment and subsequent re-refereeing, or (iii) rejected on the grounds of either relevance or content. The submission of a paper implies that, if accepted for publication, it will not be published elsewhere in the same form, in any language, without the prior consent of the Publisher.