Computational prediction of phosphorylation sites of SARS-CoV-2 infection using feature fusion and optimization strategies

IF 4.2; Zone 3, Biology; JCR Q1, Biochemical Research Methods
Mumdooh J. Sabir, Majid Rasool Kamli, Ahmed Atef, Alawiah M. Alhibshi, Sherif Edris, Nahid H. Hajarah, Ahmed Bahieldin, Balachandran Manavalan, Jamal S.M. Sabir
Methods, volume 229, pages 1-8. Published 2024-05-18. DOI: 10.1016/j.ymeth.2024.04.021
https://www.sciencedirect.com/science/article/pii/S1046202324001300
Citations: 0

Abstract

SARS-CoV-2's global spread has caused a critical health and economic emergency, affecting countless individuals. Understanding the virus's phosphorylation sites is vital to unraveling the molecular intricacies of the infection and the subsequent changes in host cellular processes. Several computational methods have been proposed to identify phosphorylation sites, typically focusing on either serine/threonine (S/T) or tyrosine (Y) phosphorylation sites. Unfortunately, current predictive tools perform best on these specific residues and may not extend their efficacy to other residues, which underscores the need for improved methods. In this study, we developed a novel predictor that integrates phosphorylation site information for all three residues (S, T, and Y). We extracted ten different feature descriptors, primarily derived from composition, evolutionary, and position-specific information, and assessed their discriminative power using five classifiers. Our results indicated that Light Gradient Boosting (LGB) showed superior performance and that five descriptors displayed excellent discriminative capabilities. Subsequently, we identified the top two integrated features with high discriminative capability and trained LGB on them to develop the final prediction model, LGB-IPs. The proposed approach performs well in 10-fold cross-validation, with ACC, MCC, and AUC values of 0.831, 0.662, and 0.907, respectively. Notably, this performance is replicated in the independent evaluation. Consequently, our approach may provide biomedical researchers with valuable insights into the phosphorylation mechanisms in SARS-CoV-2 infection.
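
The abstract reports accuracy (ACC), Matthews correlation coefficient (MCC), and area under the ROC curve (AUC) without defining them. The sketch below shows how these three metrics are conventionally computed for a binary site/non-site classifier. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the toy values have no connection to the figures reported in the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 marks a phosphorylation site."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """ACC: fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """MCC: correlation between predicted and true labels, in [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes at least one positive and one negative.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four true sites, four non-sites, with model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn), mcc(tp, tn, fp, fn), auc(y_true, scores))
```

ACC and MCC depend on the chosen threshold, whereas AUC is computed from the raw scores and is threshold-free, which is one reason studies of this kind often report all three together.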

Source journal: Methods (Biology; Biochemical Research Methods)
CiteScore: 9.80
Self-citation rate: 2.10%
Articles published: 222
Review time: 11.3 weeks
About the journal: Methods focuses on rapidly developing techniques in the experimental biological and medical sciences. Each topical issue, organized by a guest editor who is an expert in the area covered, consists solely of invited quality articles by specialist authors, many of them reviews. Issues are devoted to specific technical approaches, with emphasis on clear, detailed descriptions of protocols that allow them to be reproduced easily. The background information provided enables researchers to understand the principles underlying the methods; other helpful sections include comparisons of alternative methods giving the advantages and disadvantages of particular methods, guidance on avoiding potential pitfalls, and suggestions for troubleshooting.