Classification of proteins in intracellular and secretory pathway using global descriptors of amino acid sequence

G. Govindan, A. Nair
2011 World Congress on Information and Communication Technologies, published 2011-12-01. DOI: 10.1109/WICT.2011.6141236 (https://doi.org/10.1109/WICT.2011.6141236)

Abstract

It is widely recognized that information in the amino acid sequence provides crucial pointers for predicting the subcellular location of proteins. We introduce a new feature vector for predicting, from the protein sequence, which compartment of the intracellular and secretory pathway a protein is targeted to. The features are based on the global Composition, Transition and Distribution (CTD) of amino acid attributes such as hydrophobicity, normalized van der Waals volume, polarity, polarizability, charge, secondary structure and solvent accessibility. Each sequence is divided into three equal parts, and the features are extracted separately for each part. Using these feature vectors, we trained a Support Vector Machine to classify intracellular and secretory proteins. On an independent dataset, at the root level of the protein sorting pathway, our method achieves an accuracy of 92% for human, 88% for plant and 95% for fungal proteins.
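The segment-wise CTD extraction described above can be illustrated with a minimal Python sketch for a single attribute. Several details here are assumptions, since the abstract does not give them: the three-class hydrophobicity grouping (a grouping commonly used for CTD descriptors), the rule for splitting sequences whose length is not a multiple of three, the five distribution checkpoints (first, 25%, 50%, 75% and last occurrence), and all function names. It is not the authors' implementation.

```python
import math
from itertools import combinations

# Assumed three-class hydrophobicity grouping (polar, neutral, hydrophobic).
# The abstract does not state the grouping actually used in the paper.
HYDROPHOBICITY = {"1": "RKEDQN", "2": "GASTPHY", "3": "CLVIMFW"}
CLASS_OF = {aa: c for c, aas in HYDROPHOBICITY.items() for aa in aas}


def encode(seq):
    """Map each standard residue to its class label; skip unknown symbols."""
    return "".join(CLASS_OF[aa] for aa in seq.upper() if aa in CLASS_OF)


def composition(enc):
    """Fraction of residues falling in each of the three classes."""
    n = len(enc)
    return [enc.count(c) / n if n else 0.0 for c in "123"]


def transition(enc):
    """Fraction of adjacent pairs that switch between two given classes."""
    pairs = list(zip(enc, enc[1:]))
    n = len(pairs)
    feats = []
    for a, b in combinations("123", 2):
        k = sum(1 for p in pairs if p in ((a, b), (b, a)))
        feats.append(k / n if n else 0.0)
    return feats


def distribution(enc):
    """Relative positions of the first, 25%, 50%, 75% and last occurrence
    of each class (assumed checkpoints)."""
    n = len(enc)
    feats = []
    for c in "123":
        pos = [i + 1 for i, x in enumerate(enc) if x == c]
        if not pos:
            feats.extend([0.0] * 5)
            continue
        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            idx = max(1, math.ceil(frac * len(pos))) - 1
            feats.append(pos[idx] / n)
    return feats


def split_three(seq):
    """Split into three near-equal parts (assumed rule for uneven lengths)."""
    n = len(seq)
    a, b = n // 3, (2 * n) // 3
    return [seq[:a], seq[a:b], seq[b:]]


def ctd_features(seq):
    """Concatenate C, T and D features computed separately on each third."""
    feats = []
    for part in split_three(seq):
        enc = encode(part)
        feats += composition(enc) + transition(enc) + distribution(enc)
    return feats


if __name__ == "__main__":
    example = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # arbitrary illustrative sequence
    print(len(ctd_features(example)))
```

In this sketch each third contributes 3 composition, 3 transition and 15 distribution values, so one attribute yields 63 values per sequence. Repeating the same procedure with a different grouping for each remaining attribute and concatenating the results would give the kind of vector that can then be passed to a Support Vector Machine.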