Hyb_SEnc: An Antituberculosis Peptide Predictor Based on a Hybrid Feature Vector and Stacked Ensemble Learning

IF 3.6; Zone 3 (Biology); JCR Q2 (Biochemical Research Methods)
Xiuhao Fu, Hao Duan, Xiaofeng Zang, Chunling Liu, Xingfeng Li, Qingchen Zhang, Zilong Zhang, Quan Zou, Feifei Cui
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publication date: 2024-07-31
DOI: 10.1109/TCBB.2024.3425644 (https://doi.org/10.1109/TCBB.2024.3425644)
Citations: 0

Abstract

Tuberculosis has plagued mankind since ancient times, and the struggle between humans and the disease continues. Mycobacterium tuberculosis, the leading cause of tuberculosis, infects nearly one-third of the world's population. The rise of peptide drugs has opened a new direction in tuberculosis treatment, which makes the prediction of anti-tuberculosis peptides crucial. This paper proposes an anti-tuberculosis peptide prediction method based on hybrid features and stacked ensemble learning. First, random forest (RF) and extremely randomized trees (ERT) are selected as the first-level learners of the stacked ensemble. Next, the five best-performing feature encoding methods are combined into a hybrid feature vector, which is refined by recursive feature elimination with a decision tree (DT-RFE). The resulting optimal feature subset serves as the input to the stacked ensemble model. Logistic regression (LR) is used as the second-level learner to build the final stacked ensemble model, Hyb_SEnc. Hyb_SEnc achieved prediction accuracies of 94.68% and 95.74% on the independent test sets of AntiTb_MD and AntiTb_RD, respectively. In addition, we provide a user-friendly web server (http://www.bioailab.com/Hyb_SEnc). The source code is freely available at https://github.com/fxh1001/Hyb_SEnc.
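
The pipeline described in the abstract (hybrid encoding, DT-RFE selection, then an RF + ERT stack with an LR second-level learner) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation, which is in their GitHub repository. The abstract does not name the five encodings, so amino acid composition (AAC) and dipeptide composition (DPC) stand in for them here. The function names, the 50-feature target, the tree counts, and the synthetic toy peptides are likewise hypothetical.

```python
import random

import numpy as np
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(seq):
    """Amino acid composition: 20 residue frequencies (stand-in encoding)."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: 400 adjacent-pair frequencies (stand-in encoding)."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    n = max(len(pairs), 1)
    return [pairs.count(a + b) / n for a in AMINO_ACIDS for b in AMINO_ACIDS]


def hybrid_features(seqs):
    """Concatenate the individual encodings into one hybrid vector per peptide."""
    return np.array([aac(s) + dpc(s) for s in seqs])


def build_model(n_features=50, seed=0):
    """DT-RFE feature selection followed by an RF + ERT stack with an LR meta-learner."""
    # RFE ranks features by decision-tree importance and drops 10% per round.
    selector = RFE(DecisionTreeClassifier(random_state=seed),
                   n_features_to_select=n_features, step=0.1)
    stack = StackingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
            ("ert", ExtraTreesClassifier(n_estimators=100, random_state=seed)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,  # out-of-fold first-level predictions train the second level
    )
    return make_pipeline(selector, stack)


# Synthetic toy data (NOT AntiTb_MD or AntiTb_RD): "positive" peptides are
# artificially enriched in K, R, L, and W so that there is a signal to learn.
rng = random.Random(0)


def toy_peptide(weights):
    length = rng.randint(8, 20)
    return "".join(rng.choices(AMINO_ACIDS, weights=weights, k=length))


positive_weights = [4 if a in "KRLW" else 1 for a in AMINO_ACIDS]
uniform_weights = [1] * len(AMINO_ACIDS)
seqs = ([toy_peptide(positive_weights) for _ in range(40)]
        + [toy_peptide(uniform_weights) for _ in range(40)])
y = np.array([1] * 40 + [0] * 40)

X = hybrid_features(seqs)
model = build_model().fit(X, y)
print(model.predict(hybrid_features(["KKLLRRWWKLKR"])))
```

Placing the selector and the stack in one pipeline means the feature subset is learned only from the data passed to `fit`, so an independent test set stays untouched during selection.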

Source journal
CiteScore: 7.50
Self-citation rate: 6.70%
Articles published: 479
Review time: 3 months
About the journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics emphasizes the algorithmic, mathematical, statistical, and computational methods that are central to bioinformatics and computational biology; the development and testing of effective computer programs in bioinformatics; the development of biological databases; important biological results obtained from the use of these methods, programs, and databases; and the emerging field of systems biology, where many forms of data are used to create a computer-based model of a complex biological system.