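Identification of chloroplast and sub-chloroplast proteins from sequence-attributed features using support vector machine and domain information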

Ravindra Kumar, Anjali Garg, B. Kumari, Aakriti Jain, Manish Kumar, Equal Contribution
2021 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB)
Published: 2021-10-13
DOI: 10.1109/CIBCB49929.2021.9562787 (https://doi.org/10.1109/CIBCB49929.2021.9562787)

Abstract

Chloroplasts are among the most important organelles of plants, including algae. Several vital biological pathways are confined exclusively to the chloroplast, which underlines the significance of chloroplastidic proteins. Predicting chloroplastidic proteins and their localization within the chloroplast (sub-chloroplastidial localization) is therefore important for understanding the roles of both novel and known chloroplastidic proteins. Several experimental methods have been developed to determine the subcellular localization of proteins; however, annotating every protein experimentally requires considerable time and resources. To overcome this limitation, many computational methods have been proposed that reduce the time and resources required. To speed up the prediction of chloroplastidic proteins and their sub-chloroplastidial localization while maintaining efficiency, we developed SubChloroPred, a two-level prediction framework based on Pfam domains and support vector machines. At the first level, SubChloroPred predicts whether a protein is chloroplastidic; at the second level, it predicts the protein's sub-chloroplastidic location, such as thylakoid or stroma. SubChloroPred achieves an overall prediction accuracy of 94.86% at the first level and accuracies of 75.91% and 74.26% at the second level for thylakoid and stroma, respectively. We have also developed a freely accessible web server and standalone software for the scientific community, available at http://proteininformatics.org/mkumar/SubChloroPred.
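
The two-level flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the SubChloroPred implementation: the abstract does not say which sequence features are used or how Pfam domain hits are combined with the SVM output, so the amino-acid composition feature, the "domain lookup first, classifier second" ordering, and every name below (`amino_acid_composition`, `predict_two_level`, `domain_labels`, the toy classifiers, the accession `PF_EXAMPLE`) are assumptions made for illustration. The two classifiers are passed in as plain callables that stand in for trained support vector machines.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# A classifier here is any callable mapping a feature vector to a label.
# In a real pipeline this would be a trained SVM (assumption).
Classifier = Callable[[Sequence[float]], str]


def amino_acid_composition(sequence: str) -> List[float]:
    """Fraction of each standard residue; an assumed example feature."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def predict_two_level(
    sequence: str,
    domains: Sequence[str],
    domain_labels: Dict[str, str],
    level1: Classifier,
    level2: Classifier,
) -> Dict[str, Optional[object]]:
    """Level 1: chloroplast or not. Level 2: thylakoid or stroma."""
    # Assumed combination rule: a known domain hit decides the location directly.
    for accession in domains:
        if accession in domain_labels:
            return {"is_chloroplast": True,
                    "location": domain_labels[accession],
                    "evidence": "domain"}

    features = amino_acid_composition(sequence)

    # Level 1: stop early for proteins predicted to be non-chloroplastidic.
    if level1(features) != "chloroplast":
        return {"is_chloroplast": False, "location": None,
                "evidence": "classifier"}

    # Level 2: only chloroplastidic proteins reach the sub-location step.
    return {"is_chloroplast": True, "location": level2(features),
            "evidence": "classifier"}


# Toy stand-ins with arbitrary thresholds, NOT trained models.
def toy_level1(features: Sequence[float]) -> str:
    return "chloroplast" if features[AMINO_ACIDS.index("A")] > 0.2 else "other"


def toy_level2(features: Sequence[float]) -> str:
    return "thylakoid" if features[AMINO_ACIDS.index("L")] > 0.2 else "stroma"


if __name__ == "__main__":
    print(predict_two_level("AAAL", [], {}, toy_level1, toy_level2))
    print(predict_two_level("GGGG", ["PF_EXAMPLE"],
                            {"PF_EXAMPLE": "stroma"}, toy_level1, toy_level2))
```

Keeping the classifiers as injected callables keeps the cascade logic separate from the model choice, so trained SVM decision functions could replace the toy stand-ins without changing `predict_two_level`.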