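Deep-Learning-Guided Mining and Clustering of Remote Amino Acid Residues for the Simultaneous Engineering of the Catalytic Activity and Thermostability of a Processive Endoglucanase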

Impact factor 3.9 · Zone 2, Biology · JCR Q1, Biochemical Research Methods
Mujunqi Wu, Yuzhen Huang, Xinyan He, Kequan Chen, Bin Wu, Gerhard Schenk
ACS Synthetic Biology. Published 2025-10-03. DOI: 10.1021/acssynbio.5c00454 (https://doi.org/10.1021/acssynbio.5c00454)
Citations: 0

Abstract

Deep-Learning-Guided Mining and Clustering of Remote Amino Acid Residues for the Simultaneous Engineering of the Catalytic Activity and Thermostability of a Processive Endoglucanase.

Processive endoglucanases, which possess both endo- and exoglucanase activities, are considered highly promising catalysts in cellulose degradation. In this study, we employed multiple deep learning models, including MutCompute, DeepSequence, and ESM-1v, to guide the engineering of EG5C-1, a processive endoglucanase derived from Bacillus subtilis BS-5. This enabled a systematic exploration of the enzyme's sequence space. Through a combination of clustering analysis and a greedy algorithm, we optimized combinations of amino acid substitutions and ultimately identified an elite variant, M8 (R23Q/E43Q/K91I/K191P/A198T/Q237D/V240P/S245A), composed entirely of substituted residues. Compared to the wild-type enzyme, M8 exhibited 10-fold and 5-fold improvements in catalytic efficiency (kcat/Km) toward soluble substrate carboxymethyl cellulose-Na (CMC) and insoluble substrate phosphoric acid-swollen cellulose (PASC), respectively, along with enhanced optimal temperature and thermostability. Molecular mechanistic analyses revealed that all distal substituted residues enhanced dynamic coupling and coordination, primarily influencing the conformation of three loops near the substrate pocket. These structural changes modulated substrate binding and product release, thereby contributing to improved catalytic efficiency (kcat/Km). This work not only suggests a feasible strategy to explore the "dark space" within sequences but also provides insights into the practical application of machine learning in experiments.
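
The abstract names the M8 variant by its substitution list and mentions a greedy algorithm for combining substitutions, but gives no details of either the clustering or the greedy step. The following is therefore only a minimal sketch of the general idea, not the authors' method: it parses the slash-separated notation (wild-type residue, position, mutant residue) and runs a generic greedy forward selection. The names `Substitution`, `parse_variant`, `greedy_combine`, and the `score` callback are hypothetical, and any scoring function passed in would be the reader's own assumption (for example, a model-predicted or measured fitness value).

```python
import re
from typing import Callable, NamedTuple


class Substitution(NamedTuple):
    """One point substitution such as R23Q (hypothetical helper type)."""
    wild_type: str
    position: int
    mutant: str

    def __str__(self) -> str:
        return f"{self.wild_type}{self.position}{self.mutant}"


# One uppercase residue letter, a position number, one uppercase residue letter.
_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(spec: str) -> list[Substitution]:
    """Split a slash-separated variant string into Substitution records."""
    subs = []
    for token in spec.split("/"):
        match = _PATTERN.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        wt, pos, mut = match.groups()
        subs.append(Substitution(wt, int(pos), mut))
    return subs


def greedy_combine(
    candidates: list[Substitution],
    score: Callable[[list[Substitution]], float],
    max_size: int,
) -> list[Substitution]:
    """Greedy forward selection: repeatedly add the single substitution
    that raises the score the most, and stop when nothing improves it.
    The score function is an assumption supplied by the caller."""
    chosen: list[Substitution] = []
    best = score(chosen)
    remaining = list(candidates)
    while remaining and len(chosen) < max_size:
        top_score, top_sub = max(
            ((score(chosen + [s]), s) for s in remaining),
            key=lambda pair: pair[0],
        )
        if top_score <= best:
            break  # no remaining candidate improves the combination
        chosen.append(top_sub)
        remaining.remove(top_sub)
        best = top_score
    return chosen


# The M8 substitution list exactly as written in the abstract.
M8 = parse_variant("R23Q/E43Q/K91I/K191P/A198T/Q237D/V240P/S245A")
```

In practice the `score` callback is where experimental measurements or model predictions would enter. A greedy search of this kind evaluates far fewer combinations than exhaustive enumeration, at the cost of possibly missing combinations whose benefit only appears when several substitutions are introduced together.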

Source journal: ACS Synthetic Biology
CiteScore: 8.00
Self-citation rate: 10.60%
Articles published: 380
Review time: 6-12 weeks

About the journal: The journal is particularly interested in studies on the design and synthesis of new genetic circuits and gene products; computational methods in the design of systems; and integrative applied approaches to understanding disease and metabolism. Topics may include, but are not limited to: design and optimization of genetic systems; genetic circuit design and the principles for their organization into programs; computational methods to aid the design of genetic systems; experimental methods to quantify genetic parts, circuits, and metabolic fluxes; genetic parts libraries: their creation, analysis, and ontological representation; protein engineering including computational design; metabolic engineering and cellular manufacturing, including biomass conversion; natural product access, engineering, and production; creative and innovative applications of cellular programming; medical applications, tissue engineering, and the programming of therapeutic cells; minimal cell design and construction; genomics and genome replacement strategies; viral engineering; automated and robotic assembly platforms for synthetic biology; DNA synthesis methodologies; metagenomics and synthetic metagenomic analysis; bioinformatics applied to gene discovery, chemoinformatics, and pathway construction; gene optimization; methods for genome-scale measurements of transcription and metabolomics; systems biology and methods to integrate multiple data sources; in vitro and cell-free synthetic biology and molecular programming; nucleic acid engineering.