Protein attributes contribute to halo-stability, bioinformatics approach.

Esmaeil Ebrahimie, Mansour Ebrahimi, Narjes Rahpayma Sarvestani, Mahdi Ebrahimi
Saline Systems. Journal article, published 2011-05-18. DOI: 10.1186/1746-1448-7-1 (https://doi.org/10.1186/1746-1448-7-1)
Citations: 58

Abstract


Halophilic proteins tolerate high salt concentrations. Understanding the features that underlie halophilicity is the first step toward engineering halostable crops. To this end, we examined the protein features that contribute to the halo-tolerance of halophilic organisms. We compared more than 850 features of halophilic and non-halophilic proteins using various screening, clustering, decision tree, and generalized rule induction models to search for patterns that code for halo-tolerance. Up to 251 protein attributes were selected as important features by the various attribute weighting algorithms; of these, 14 attributes were selected by 90% of the models, and the count of hydrogen received the highest weight (1.0) in 70% of the attribute weighting models, which shows the importance of this attribute in feature selection modeling. Most of the other attributes were dipeptide frequencies. The number of groups did not change when K-Means and TwoStep clustering were run on datasets with or without feature selection filtering. Although the induced trees were shallow, their accuracies were higher than 94%, and the frequency of hydrophobic residues emerged as the most important feature for building the trees. The performance evaluation of the decision tree models gave the same values, and the best percentage of correctness was recorded with the Exhaustive CHAID and CHAID models. We found no significant difference in the percentage of correctness, performance evaluation, or mean correctness of the various decision tree models with or without feature selection. For the first time, we analyzed the performance of different screening, clustering, and decision tree algorithms for discriminating halophilic from non-halophilic proteins, and the results showed that amino acid composition can be used to discriminate between halo-tolerant and halo-sensitive proteins.
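
Two of the feature types named in the abstract, dipeptide frequencies and the frequency of hydrophobic residues, can be computed directly from a protein sequence. The sketch below is only an illustration and is not the authors' pipeline: the function names are hypothetical, and the set of residues counted as hydrophobic is an assumption, since the abstract does not say which grouping was used.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: one commonly used hydrophobic grouping. The abstract does not
# define the set, so results depend on this choice.
HYDROPHOBIC = set("AVLIMFW")


def dipeptide_frequencies(seq):
    """Return the relative frequency of each of the 400 possible dipeptides,
    counted over overlapping residue pairs in the sequence."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    total = len(pairs)
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


def hydrophobic_fraction(seq):
    """Return the fraction of residues that fall in the assumed hydrophobic set."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(residue in HYDROPHOBIC for residue in seq) / len(seq)


if __name__ == "__main__":
    example = "AAVD"  # toy sequence, not real data
    freqs = dipeptide_frequencies(example)
    print(freqs["AA"], freqs["AV"], freqs["VD"])
    print(hydrophobic_fraction(example))
```

Feature vectors built this way, one row per protein, are the kind of input that attribute weighting, clustering, and decision tree models such as those listed in the abstract would take.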
