Identification of chloroplast and sub-chloroplast proteins from sequence-attributed features using support vector machine and domain information
Ravindra Kumar, Anjali Garg, B. Kumari, Aakriti Jain, Manish Kumar
2021 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), published 2021-10-13
DOI: 10.1109/CIBCB49929.2021.9562787 (https://doi.org/10.1109/CIBCB49929.2021.9562787)
Abstract
Chloroplasts are among the most important organelles of plants, including algae. A number of vital biological pathways are confined exclusively to the chloroplast, which underlines the significance of chloroplastidic proteins. Predicting chloroplastidic proteins and their localization within the chloroplast (sub-chloroplastidic localization) is therefore important for understanding the roles of both novel and known chloroplastidic proteins. Several experimental methods have been developed to determine the subcellular localization of proteins; however, annotating every protein experimentally requires considerable time and resources. To overcome these shortcomings, many computational methods have been proposed that reduce the time and resources required. To speed up the prediction of chloroplastidic proteins and their sub-chloroplastidic localization while maintaining efficiency, we developed SubChloroPred, a two-level prediction framework based on Pfam domains and support vector machines. At the first level, SubChloroPred predicts whether a protein is chloroplastidic; at the second level, it predicts the protein's localization at sub-chloroplastidic locations such as the thylakoid and the stroma. SubChloroPred achieves an overall prediction accuracy of 94.86% at the first level and accuracies of 75.91% and 74.26% at the second level for thylakoid and stroma, respectively. We have also developed a freely accessible web server and standalone software for the scientific community, available at http://proteininformatics.org/mkumar/SubChloroPred.
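
The two-level control flow described in the abstract can be illustrated with a minimal sketch. This is not the SubChloroPred implementation: the feature used here (amino acid composition), the function names `amino_acid_composition` and `predict_two_level`, and the idea of passing the trained models in as plain callables are all assumptions made for illustration, and the Pfam domain step is not modeled. In practice the callables would wrap trained support vector machine models.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq: str) -> List[float]:
    """Return the fraction of each standard residue in the sequence.

    Assumed example of a sequence-derived feature; the abstract does not
    specify which features SubChloroPred uses.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def predict_two_level(
    seq: str,
    is_chloroplast: Callable[[Sequence[float]], bool],
    location_scores: Dict[str, Callable[[Sequence[float]], float]],
) -> str:
    """Hypothetical two-level cascade.

    Level 1 decides chloroplast vs. non-chloroplast. Level 2 runs only for
    predicted chloroplast proteins and returns the location whose scorer
    gives the highest value.
    """
    features = amino_acid_composition(seq)
    if not is_chloroplast(features):
        return "non-chloroplast"
    return max(location_scores, key=lambda name: location_scores[name](features))
```

Running the second level only on proteins accepted at the first level means that second-level accuracy is conditional on the first-level decision, so errors at level 1 propagate to the final localization call.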