An improved method to enhance protein structural class prediction using their secondary structure sequences and genetic algorithm

M. H. Aldulaimi, S. Zainudin, A. Bakar
Int. J. Bioinform. Res. Appl. Published 2018-10-01. DOI: 10.1504/IJBRA.2018.10009965 (https://doi.org/10.1504/IJBRA.2018.10009965)
Citations: 1

Abstract

Many approaches have been proposed to improve the accuracy of protein structural class prediction. However, these approaches have not addressed low-similarity sequences, which have proved particularly challenging. In this study, a 71-dimensional integrated feature vector is extracted from the predicted secondary structure and hydropathy sequences using newly devised strategies, in order to categorise proteins into the four major structural classes: all-α, all-β, α/β and α+β. A new combined method involving two machine learning algorithms is proposed for feature selection: a support vector machine (SVM) and a genetic algorithm (GA) are combined in a wrapper approach to select the top N features ranked by importance. The proposed method is evaluated with the jackknife test on two low-similarity sequence datasets, ASTRAL and D640. Overall accuracies of 83.93% and 92.2% are reported for the ASTRAL testing and D640 benchmarks, respectively, exceeding most current approaches.
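As a rough illustration of the wrapper idea described in the abstract, the sketch below evolves binary feature masks with a simple genetic algorithm and scores each mask by jackknife (leave-one-out) accuracy. It is not the authors' implementation. The abstract does not give the GA settings, the fitness function, or the 71 features, so everything here is an assumption: the toy data, the helper names (`make_toy_data`, `jackknife_accuracy`, `ga_select`), the size penalty, and in particular the classifier, which is a nearest-centroid rule standing in for the SVM so that the example needs only the standard library.

```python
import random

def make_toy_data(seed=1):
    """Hypothetical data: 4 classes, 2 informative features, 4 noise features."""
    rng = random.Random(seed)
    centers = [(0, 0), (0, 5), (5, 0), (5, 5)]
    X, y = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(10):
            row = [cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5)]
            row += [rng.uniform(-0.5, 0.5) for _ in range(4)]
            X.append(row)
            y.append(label)
    return X, y

def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT an SVM): pick the class with the closest mean."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    def dist(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))
    return min(sorted(sums), key=dist)

def jackknife_accuracy(X, y, mask):
    """Leave-one-out accuracy using only the features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    Xs = [[row[j] for j in cols] for row in X]
    correct = 0
    for i in range(len(Xs)):
        train_X = Xs[:i] + Xs[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, Xs[i]) == y[i]:
            correct += 1
    return correct / len(Xs)

def ga_select(X, y, pop_size=20, generations=15, mutation_rate=0.05, seed=0):
    """Generic wrapper GA over feature masks (requires at least 2 features)."""
    rng = random.Random(seed)
    n = len(X[0])
    cache = {}
    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            # Assumed fitness: accuracy minus a small penalty per kept feature.
            cache[key] = jackknife_accuracy(X, y, mask) - 0.001 * sum(mask)
        return cache[key]
    # Start from the full feature set plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[:pop_size // 2]
        children = [scored[0][:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    return best, jackknife_accuracy(X, y, best)

X, y = make_toy_data()
mask, acc = ga_select(X, y)
print("selected mask:", mask, "jackknife accuracy:", acc)
```

To move closer to the setup the abstract describes, the stand-in classifier inside `jackknife_accuracy` could be replaced by an SVM from a library of your choice. The GA loop itself only needs a function that maps a feature mask to a score.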