Bigram-PGK: phosphoglycerylation prediction using the technique of bigram probabilities of position specific scoring matrix.

IF 2.4 · Zone 3 (Biology) · JCR Q4 (Cell Biology)
Abel Chandra, Alok Sharma, Abdollah Dehzangi, Daichi Shigemizu, Tatsuhiko Tsunoda
BMC Molecular and Cell Biology, volume 20, Suppl 2, page 57. Published 2019-12-20. DOI: 10.1186/s12860-019-0240-1 (https://doi.org/10.1186/s12860-019-0240-1)
Citations: 13

Abstract

Background: Post-translational modification (PTM) is a biological process in which the proteins of a proteome are modified; it affects normal cell biology and, in turn, pathogenesis. Many PTMs have been discovered in recent years, and lysine phosphoglycerylation is among the more recently identified. Although a large number of proteins have been sequenced in the post-genomic era, identifying phosphoglycerylation sites remains a major challenge because experimental approaches are costly, time-consuming and inefficient. Computational techniques have emerged to address this by identifying phosphoglycerylated lysine residues. However, the techniques proposed so far are limited in how correctly they predict this covalent modification.

Results: We propose a new predictor, Bigram-PGK, which uses evolutionary information of amino acids to predict phosphoglycerylated sites. A benchmark dataset of experimentally labelled sites is employed for this purpose, and profile bigram occurrences are calculated from the position-specific scoring matrices of the protein sequences. The reported sensitivity, specificity, precision, accuracy, Matthews correlation coefficient and area under the ROC curve are 0.9642, 0.8973, 0.8253, 0.9193, 0.8330 and 0.9306, respectively.
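
The abstract does not spell out how the profile bigram is computed. The following is a minimal sketch that assumes the commonly used definition, in which entry (m, n) of the bigram matrix sums, over consecutive positions i, the product of the score for amino acid m at position i and the score for amino acid n at position i+1. The function name `profile_bigram`, the list-of-rows input format and the absence of any normalisation or windowing around the lysine are assumptions made for illustration, not details taken from the paper.

```python
def profile_bigram(pssm):
    """Flattened profile-bigram features of a PSSM given as L rows of K scores.

    Assumed definition: B[m][n] = sum over i of pssm[i][m] * pssm[i + 1][n].
    For a standard PSSM (K = 20) this yields 20 * 20 = 400 features.
    """
    length = len(pssm)
    if length < 2:
        raise ValueError("need at least two positions to form a bigram")
    width = len(pssm[0])
    if any(len(row) != width for row in pssm):
        raise ValueError("all PSSM rows must have the same number of columns")

    bigram = [[0.0] * width for _ in range(width)]
    # Pair every position with its successor and accumulate the products.
    for i in range(length - 1):
        current_row, next_row = pssm[i], pssm[i + 1]
        for m in range(width):
            for n in range(width):
                bigram[m][n] += current_row[m] * next_row[n]

    # Flatten row by row so the result can be used as a feature vector.
    return [value for row in bigram for value in row]


if __name__ == "__main__":
    # Toy 3 x 2 matrix (hypothetical values, not real PSSM scores).
    toy = [[1, 0], [0, 1], [1, 0]]
    print(profile_bigram(toy))  # [0.0, 1.0, 1.0, 0.0]
```

Because the feature vector always has K × K entries, sequence segments of different lengths map to inputs of the same size, which is convenient for a fixed-input classifier.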

Conclusions: The proposed predictor, which combines evolutionary-information features with a support vector machine classifier, shows great potential for distinguishing phosphoglycerylated from non-phosphoglycerylated lysine residues when compared with existing predictors. The data and software for this work are available at https://github.com/abelavit/Bigram-PGK.
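
For reference, five of the six measures named in the Results follow directly from the counts of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name `binary_metrics` and the example counts are hypothetical and are not data from the paper. The area under the ROC curve is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fn: positive (modified) sites predicted correctly/incorrectly.
    tn/fp: negative (unmodified) sites predicted correctly/incorrectly.
    """
    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a class is empty.
        return numerator / denominator if denominator else 0.0

    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from the paper.
    for name, value in binary_metrics(tp=8, fp=2, tn=9, fn=1).items():
        print(f"{name}: {value:.4f}")
```

Reporting the Matthews correlation coefficient alongside accuracy is useful when positive and negative sites are imbalanced, since it takes all four counts into account.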

Source journal: BMC Molecular and Cell Biology (Biochemistry, Genetics and Molecular Biology: Cell Biology)
CiteScore: 5.50
Self-citation rate: 0.00%
Articles published: 46
Review time: 27 weeks