Machine Learning Applied to Predicting Microorganism Growth Temperatures and Enzyme Catalytic Optima

Impact factor 3.9; Zone 2 (Biology); JCR Q1 (Biochemical Research Methods)
Gang Li, Kersten S. Rabe, Jens Nielsen, Martin K. M. Engqvist*
ACS Synthetic Biology, vol. 8, no. 6, pp. 1411–1420. Published 2019-05-22. DOI: 10.1021/acssynbio.9b00099. https://pubs.acs.org/doi/10.1021/acssynbio.9b00099
Citations: 58

Abstract

Enzymes that catalyze chemical reactions at high temperatures are used in industrial biocatalysis, in molecular biology applications, and as highly evolvable starting points for protein engineering. The optimal growth temperature (OGT) of organisms is commonly used to estimate the stability of enzymes encoded in their genomes, but the number of experimentally determined OGT values is limited, particularly for thermophilic organisms. Here, we report on the development of a machine learning model that can accurately predict OGT for bacteria, archaea, and microbial eukaryotes directly from their proteome-wide 2-mer amino acid composition. The trained model is made freely available for reuse. In a subsequent step, we use OGT data in combination with the amino acid composition of individual enzymes to develop a second machine learning model that predicts enzyme catalytic temperature optima (Topt). The resulting model generates enzyme Topt estimates that are far superior to those obtained using OGT alone. Finally, we predict Topt for 6.5 million enzymes, covering 4447 enzyme classes, and make the resulting data set available to researchers. This work enables simple and rapid identification of enzymes that are potentially functional at extreme temperatures.
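
The two feature sets named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the alphabetical feature ordering, the pooling of dipeptide counts across all proteins, the skipping of non-standard residues, and the use of single-residue composition for the enzyme model are all assumptions. The abstract does not name the regression algorithm, so none is shown.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids; the ordering is an arbitrary choice for this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 possible 2-mers (dipeptides), in a fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequences):
    """Relative frequency of each dipeptide, pooled over a whole proteome.

    Overlapping 2-mers are counted in every sequence; 2-mers containing a
    non-standard residue (for example 'X') are ignored (an assumption).
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts[d] for d in DIPEPTIDES)
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def amino_acid_composition(sequence):
    """Relative frequency of each standard amino acid in one enzyme."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def topt_features(sequence, ogt_celsius):
    """Hypothetical feature vector for a Topt model: composition plus OGT."""
    return amino_acid_composition(sequence) + [float(ogt_celsius)]


if __name__ == "__main__":
    toy_proteome = ["MKVLAAGIV", "MSTNPKPQR"]  # made-up sequences
    ogt_vector = dipeptide_composition(toy_proteome)
    print(len(ogt_vector))                        # 400 features per organism
    print(len(topt_features("MKVLAAGIV", 37.0)))  # 20 + 1 features per enzyme
```

Either vector could then be passed to any standard regression model, with one row per organism for OGT prediction and one row per enzyme for Topt prediction.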

Source journal metrics: CiteScore 8.00; self-citation rate 10.60%; articles published 380; review time 6-12 weeks.
Journal scope: The journal is particularly interested in studies on the design and synthesis of new genetic circuits and gene products; computational methods in the design of systems; and integrative applied approaches to understanding disease and metabolism.

Topics may include, but are not limited to: design and optimization of genetic systems; genetic circuit design and the principles for their organization into programs; computational methods to aid the design of genetic systems; experimental methods to quantify genetic parts, circuits, and metabolic fluxes; genetic parts libraries (their creation, analysis, and ontological representation); protein engineering, including computational design; metabolic engineering and cellular manufacturing, including biomass conversion; natural product access, engineering, and production; creative and innovative applications of cellular programming; medical applications, tissue engineering, and the programming of therapeutic cells; minimal cell design and construction; genomics and genome replacement strategies; viral engineering; automated and robotic assembly platforms for synthetic biology; DNA synthesis methodologies; metagenomics and synthetic metagenomic analysis; bioinformatics applied to gene discovery, chemoinformatics, and pathway construction; gene optimization; methods for genome-scale measurements of transcription and metabolomics; systems biology and methods to integrate multiple data sources; in vitro and cell-free synthetic biology and molecular programming; nucleic acid engineering.