PredPhos: an ensemble framework for structure-based prediction of phosphorylation sites
Yong Gao, Weilin Hao, Jing Gu, Diwei Liu, Chao Fan, Zhigang Chen, Lei Deng
Journal of Biological Research-Thessaloniki, published 2016-07-04. DOI: 10.1186/s40709-016-0042-y (https://doi.org/10.1186/s40709-016-0042-y)
Citations: 8
Abstract
Background: Post-translational modifications (PTMs) occur on almost all proteins and often strongly affect the functions of modified proteins. Phosphorylation is a crucial PTM mechanism with important regulatory functions in biological systems. Identifying the potential phosphorylation sites of a target protein may increase our understanding of the molecular processes in which it takes part.
Results: In this paper, we propose PredPhos, a computational method that can accurately predict both kinase-specific and non-kinase-specific phosphorylation sites by using optimally selected properties. The optimal combination of features was selected from a set of 153 novel structural neighborhood properties by a two-step feature selection method consisting of a random forest algorithm followed by sequential backward elimination. To address the class-imbalance problem, we adopt an ensemble method that combines bootstrap resampling, support vector machine-based fusion classifiers, and a majority voting strategy. We evaluate the proposed method using both tenfold cross-validation and an independent test set. Results show that our method achieves a significant improvement in prediction performance for both kinase-specific and non-kinase-specific phosphorylation sites.
Conclusions: The experimental results demonstrate that the proposed method is effective in predicting phosphorylation sites. The promising results stem from the new structural neighborhood properties, the novel feature selection procedure, and the ensemble method.
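The ensemble strategy named in the Results paragraph (bootstrap resampling, several base classifiers, majority voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no sampling ratios, ensemble size, or SVM settings, so every function name and parameter below is hypothetical. To keep the example free of dependencies, a simple nearest-centroid classifier stands in for the support vector machines used in the paper.

```python
import random
from collections import Counter


def balanced_bootstrap(pos, neg, rng):
    """Resample with replacement so both classes contribute len(pos) rows.

    Assumption: positives (phosphorylation sites) are the minority class.
    """
    n = len(pos)
    sample_pos = [rng.choice(pos) for _ in range(n)]
    sample_neg = [rng.choice(neg) for _ in range(n)]
    return sample_pos, sample_neg


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    dim = len(rows[0])
    return [sum(r[i] for r in rows) / len(rows) for i in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_base(pos, neg):
    """Stand-in base learner (nearest centroid); the paper uses SVMs here."""
    c_pos, c_neg = centroid(pos), centroid(neg)
    return lambda x: 1 if sq_dist(x, c_pos) < sq_dist(x, c_neg) else 0


def train_ensemble(pos, neg, n_models=11, seed=0):
    """Train one base learner per balanced bootstrap sample.

    n_models is a hypothetical default; an odd value avoids tied votes.
    """
    rng = random.Random(seed)
    return [train_base(*balanced_bootstrap(pos, neg, rng))
            for _ in range(n_models)]


def predict(models, x):
    """Majority vote over the base learners: 1 = site, 0 = non-site."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]
```

Each base learner sees a class-balanced resample, so no single model is dominated by the far more numerous non-sites, while the vote across models reduces the variance introduced by resampling. In a closer reproduction, `train_base` would be replaced by an SVM trained on the selected structural neighborhood features.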
About the journal:
Journal of Biological Research-Thessaloniki is a peer-reviewed, open access, international journal that publishes articles providing novel insights into the major fields of biology.
Topics covered in Journal of Biological Research-Thessaloniki include, but are not limited to: molecular biology, cytology, genetics, evolutionary biology, morphology, development and differentiation, taxonomy, bioinformatics, physiology, marine biology, behaviour, ecology and conservation.