Sagnik Banerjee, Subhadip Basu, Debjyoti Ghosh, M. Nasipuri
2015 International Conference and Workshop on Computing and Communication (IEMCON). Published 2015-12-03. DOI: 10.1109/IEMCON.2015.7344514
PhospredRF: Prediction of protein phosphorylation sites using a consensus of random forest classifiers
Post-translational modification (PTM) is a process by which proteins undergo chemical changes after they are translated from mRNA. Among the various types of PTM, phosphorylation is one of the most important, since it is involved in almost all cellular activities. In this work we use a machine-learning approach to predict the positions at which phosphorylation occurs. The classifier is a random forest, and the features are evolutionary information extracted from Position-Specific Scoring Matrices (PSSMs). When tested on an independent set of 141 proteins, our system achieved an AUC of 0.699. It also attained the best performance on a set of 22 non-trivial proteins.
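The pipeline described in the abstract (PSSM-derived features, several classifiers combined by consensus, evaluation by AUC) can be illustrated with a minimal sketch. The abstract does not give the window size, the padding rule, the consensus rule, or the target residues, so the following are all assumptions made for illustration: a symmetric window with zero padding, a strict majority vote, and serine, threonine and tyrosine (S, T, Y) as candidate residues. The function names are hypothetical, and the stub callables in the example stand in for trained random forests; this is not the authors' implementation.

```python
from typing import Callable, List, Sequence

PSSMRow = Sequence[float]  # one row of scores per residue position


def candidate_sites(sequence: str, residues: str = "STY") -> List[int]:
    """Indices of residues that could be phosphorylated (assumed: S, T, Y)."""
    return [i for i, aa in enumerate(sequence) if aa in residues]


def window_features(pssm: Sequence[PSSMRow], center: int,
                    half_width: int = 2) -> List[float]:
    """Flatten the PSSM rows in a window around `center`.

    Positions that fall off either end of the protein are zero-padded
    (an assumption; the abstract does not state the padding rule).
    """
    n_cols = len(pssm[0])
    features: List[float] = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(pssm):
            features.extend(pssm[pos])
        else:
            features.extend([0.0] * n_cols)
    return features


def consensus_predict(classifiers: Sequence[Callable[[List[float]], bool]],
                      features: List[float]) -> bool:
    """Strict majority vote over the classifiers (assumed consensus rule)."""
    votes = sum(1 for clf in classifiers if clf(features))
    return votes * 2 > len(classifiers)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: a 4-residue protein with a 2-column PSSM (real PSSMs have 20).
toy_pssm = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
# Stub classifiers standing in for trained random forests.
stubs = [lambda f: sum(f) > 10, lambda f: f[0] > 0, lambda f: max(f) >= 6]

for site in candidate_sites("ASTK"):
    feats = window_features(toy_pssm, site, half_width=1)
    print(site, consensus_predict(stubs, feats))
```

In practice each stub would be replaced by a trained classifier's prediction function, and `auc` would be applied to the per-site scores on the independent test set.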