Predicting Protein Subcellular Localization: A Multiobjective PSO-based Feature Subset Selection from Amino Acid Sequence of Protein

M. Mandal, A. Mukhopadhyay
DOI: 10.1109/ICIT.2014.75
Published in: 2014 17th International Conference on Computer and Information Technology (ICCIT), pp. 251-255
Publication date: 2014-12-22
Citations: 0

Abstract

In this article, the probable subcellular location of a protein is predicted by applying a feature selection technique based on multiobjective particle swarm optimization (MOPSO). The feature set is created from the different amino acid compositions of the protein, so the dataset is a matrix of protein samples versus amino acid composition features. The proposed algorithm is designed to find a subset of features such that feature relevance is maximized and feature redundancy is minimized simultaneously. The algorithm is executed on the multiclass dataset to select a subset of features. Using the selected features, 10-fold cross-validation is applied, and the corresponding accuracy, F-score, entropy, representation entropy, and average correlation are calculated. The performance of the proposed method is compared with that of its single-objective versions, Sequential Forward Search, Sequential Backward Search, and minimum Redundancy Maximum Relevance (mRMR) with two schemes.
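The abstract does not say how relevance and redundancy are measured, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes single-residue amino acid composition as the features, mean absolute Pearson correlation with a numeric (here binary) class label as relevance, and mean absolute pairwise Pearson correlation among the selected features as redundancy. The function names `composition`, `objectives`, and `dominates` are hypothetical. A multiobjective PSO would compare candidate feature masks (particle positions) using a Pareto dominance test like the one shown.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each standard amino acid in seq (assumed feature scheme)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)  # ignore non-standard symbols
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def objectives(X, y, mask):
    """Return (relevance, redundancy) for the features switched on in mask.

    Assumed measures: relevance is the mean |corr(feature, label)| and
    redundancy is the mean |corr(feature_i, feature_j)| over selected pairs.
    """
    cols = [[row[j] for row in X] for j, on in enumerate(mask) if on]
    if not cols:
        return 0.0, 0.0
    relevance = sum(abs(pearson(c, y)) for c in cols) / len(cols)
    pairs = [(a, b) for i, a in enumerate(cols) for b in cols[i + 1:]]
    if not pairs:
        return relevance, 0.0
    redundancy = sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)
    return relevance, redundancy


def dominates(p, q):
    """True if p Pareto-dominates q: relevance maximized, redundancy minimized."""
    return p[0] >= q[0] and p[1] <= q[1] and p != q
```

In this sketch, a mask that adds a feature identical to one already selected keeps the same relevance but raises redundancy, so the smaller mask dominates it. For a genuinely multiclass label, a measure such as mutual information would likely be a better fit than Pearson correlation.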