Bioinformatics Approach to Classification of Four Classes of Organism in Relation to Their Optimal Growth Temperature

Hanaa M. Hussain, H. Seker, Malde Gorania, Newcastle Upon-Tyne United Kingdom Environment Newcastle
International Journal of Pharma Medicine and Biological Sciences. Published 2018-10-01. DOI: 10.18178/ijpmbs.7.4.78-83 (https://doi.org/10.18178/ijpmbs.7.4.78-83)

Abstract

Identifying the temperature class of proteins in prokaryotic organisms is one of the vital problems in enzyme and protein engineering. In this work, efficient K-NN predictive models were developed to discriminate hyperthermophilic, thermophilic, psychrophilic, and mesophilic proteins using amino acid and pseudo amino acid compositions. The two predictive models were built and tested on a large dataset consisting of 6,631 hyperthermophiles, 11,700 thermophiles, 6,267 psychrophiles, and 67,037 mesophiles. Implementation and analysis results showed that the proposed K-NN based predictive models discriminated the four classes efficiently and with high accuracy: the amino acid composition model achieved 94% accuracy under 10-fold cross-validation and 98% on a hold-out test, while the pseudo amino acid composition model achieved 99% accuracy on a hold-out test.
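As an illustration of the general technique named in the abstract, the following is a minimal sketch of K-NN classification on amino acid composition features. It is not the authors' implementation: the function names, the Euclidean distance metric, the value of k, and the toy sequences are all assumptions made for this example, and the sequences are invented and carry no biological meaning.

```python
from collections import Counter
import math

# The 20 standard amino acids, in a fixed order that defines the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(sequence):
    """Return the 20-dimensional amino acid composition of a protein sequence.

    Each entry is the fraction of standard residues equal to that amino acid,
    so the vector sums to 1. Non-standard characters are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors.

    `train` is a list of (feature_vector, label) pairs. Euclidean distance is
    an assumption here; the abstract does not state which metric was used.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: invented sequences with invented labels.
toy_sequences = [
    ("KKEEKKEEIVKE", "thermophilic"),
    ("EEKKIVEKKEEV", "thermophilic"),
    ("QQSSNNTTQSNT", "mesophilic"),
    ("SSQQTTNNSQTN", "mesophilic"),
]
train = [(aa_composition(seq), label) for seq, label in toy_sequences]

query = aa_composition("KEKEIVKEKE")
predicted = knn_predict(train, query, k=3)
print(predicted)
```

The pseudo amino acid composition model mentioned in the abstract would extend the 20-dimensional vector with additional sequence-order terms; that extension is not shown here because the abstract does not give its parameters.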