Multi-Class Protein Subcellular Localization Prediction using Support Vector Machines
Peng Wai Meng, Jagath Rajapakse
2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology
DOI: https://doi.org/10.1109/CIBCB.2005.1594964
Predicting the subcellular localization of a protein from its amino acid sequence is an important step towards elucidating the protein's function. Here, we present an approach for predicting the subcellular localization of eukaryotic proteins using Support Vector Machines. In addition to amino acid composition, our approach considers the biochemical characteristics of amino acids and their distribution patterns along the primary sequence of the query protein. As a result, it achieves improved predictive accuracy on the Reinhardt and Hubbard dataset. For the four subcellular localizations of eukaryotic proteins, the total prediction accuracy obtained with the “leave-one-out” cross-validation test is 88.88%. To the best of our knowledge, our approach achieves by far the best prediction accuracy for mitochondrial proteins, which are notoriously difficult to localize among eukaryotic proteins. Performance comparisons also show that our approach outperforms existing protein subcellular localization prediction methods based solely on amino acid composition.
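To make the feature description above concrete, the following is a minimal sketch of how a sequence might be turned into a fixed-length vector that combines amino acid composition with group-level distribution patterns. The abstract does not specify the paper's actual biochemical groupings or distribution descriptors, so the three-way hydrophobicity grouping in `GROUPS`, the first/median/last position descriptor, and all function names here are illustrative assumptions rather than the authors' method.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# ASSUMPTION: a common three-way hydrophobicity grouping, used only for
# illustration. The groupings used in the paper are not given in the abstract.
GROUPS = {
    "polar": set("RKEDQN"),
    "neutral": set("GASTPHY"),
    "hydrophobic": set("CLVIMFW"),
}


def composition(seq):
    """Return the fraction of each standard amino acid in seq (20 values)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def group_distribution(seq, members):
    """Relative positions of the first, median and last residue of a group.

    Positions are 1-based and divided by the sequence length, so each value
    lies in (0, 1]. A group that never occurs yields [0.0, 0.0, 0.0].
    """
    positions = [i + 1 for i, r in enumerate(seq) if r in members]
    if not positions:
        return [0.0, 0.0, 0.0]
    n = len(seq)
    median = positions[(len(positions) - 1) // 2]
    return [positions[0] / n, median / n, positions[-1] / n]


def features(seq):
    """Concatenate composition (20 values) with, for each group, its overall
    fraction (1 value) and its distribution descriptor (3 values)."""
    vec = composition(seq)
    for members in GROUPS.values():
        vec.append(sum(r in members for r in seq) / len(seq))
        vec.extend(group_distribution(seq, members))
    return vec


if __name__ == "__main__":
    print(len(features("MKKLAV")))
```

Vectors of this kind could then be passed to any multi-class SVM implementation and scored with leave-one-out cross-validation, training on all sequences but one and testing on the held-out sequence in turn.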