Ahoi Jones, Hamid D. Ismail, J. H. Kim, R. Newman, B. K.C.Dukka
2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), published 2015-11-09. DOI: 10.1109/BIBM.2015.7359670 (https://doi.org/10.1109/BIBM.2015.7359670)
RF-Phos: Random forest-based prediction of phosphorylation sites
About 30% of the proteins in the human proteome are estimated to be regulated by phosphorylation. In recent years, computational prediction of phosphorylation sites has been investigated in bioinformatics, motivated by the challenges associated with experimental methods. Previously, we developed a random forest-based method, termed Random Forest-based Phosphosite predictor (RF-Phos 1.0), to predict phosphorylation sites given only the amino acid sequence of a protein as input. Here, we report an improved version of this method, termed RF-Phos 1.1, which employs additional sequence-driven features to identify putative phosphorylation sites across many protein families. In side-by-side comparisons based on 10-fold cross-validation and an independent dataset, RF-Phos 1.1 performs comparably to or better than existing phosphosite prediction methods such as PhosphoSVM, GPS 2.1, and Musite.
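To make the idea of sequence-only input concrete, the following is a minimal sketch of how candidate sites might be turned into feature vectors and partitioned for 10-fold cross-validation. It is not the RF-Phos feature set, which the abstract does not specify: the window half-width `flank`, the padding character `X`, the amino acid composition feature, and all function names are assumptions chosen for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PHOSPHO_RESIDUES = "STY"  # serine, threonine, tyrosine


def candidate_windows(sequence, flank=4, pad="X"):
    """Yield (position, window) for every S/T/Y residue in the sequence.

    position is 1-based. Each window has length 2 * flank + 1, is centered
    on the candidate residue, and is padded with `pad` where it would run
    past either end of the protein. (flank and pad are illustrative choices.)
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    for i, residue in enumerate(seq):
        if residue in PHOSPHO_RESIDUES:
            yield i + 1, padded[i:i + 2 * flank + 1]


def composition_features(window):
    """Return the fraction of each of the 20 standard amino acids in a window.

    This is one simple sequence-derived feature, used here as a stand-in.
    """
    counts = Counter(window)
    return [counts[aa] / len(window) for aa in AMINO_ACIDS]


def k_fold_indices(n_samples, k=10):
    """Yield k (train, test) index lists covering 0..n_samples-1.

    Sample i goes to fold i % k, so every sample is held out exactly once.
    """
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        yield train, held_out


if __name__ == "__main__":
    toy_sequence = "MKSPLTAYRG"  # made-up sequence, not real data
    for position, window in candidate_windows(toy_sequence, flank=3):
        print(position, window, composition_features(window))
```

The resulting feature vectors, paired with known positive and negative site labels, could be passed to any random forest implementation. Each train/test split from `k_fold_indices` would then supply one round of the cross-validation.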