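Automated characterization and analysis of expression compatibility between regulatory sequences and metabolic genes in Escherichia coli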

Authors: Xiao Wen, Jiawei Lin, Chunhe Yang, Ying Li, Haijiao Cheng, Ye Liu, Yue Zhang, Hongwu Ma, Yufeng Mao, Xiaoping Liao, Meng Wang
Journal: Synthetic and Systems Biotechnology, volume 9, issue 4, pages 647-657
Publication date: 2024-05-17
DOI: 10.1016/j.synbio.2024.05.010
Publication type: Journal Article
Impact factor: 4.4; Zone 2 (Biology); JCR Q1 (Biotechnology & Applied Microbiology)
Article page: https://www.sciencedirect.com/science/article/pii/S2405805X24000851
Citations: 0

Abstract

Using standardized artificial regulatory sequences to fine-tune the expression of multiple metabolic pathways or genes is a key strategy in the creation of efficient microbial cell factories. However, when the expression strengths of regulatory sequences are characterized using only a few reporter genes, the measured strengths may not be applicable across diverse genes. This introduces great uncertainty into the precise regulation of multiple genes at multiple expression levels. To address this, our study adopted a fluorescent protein fusion strategy for a more accurate assessment of target protein expression levels. We combined 41 commonly used metabolic genes with 15 regulatory sequences, yielding an expression dataset encompassing 520 unique combinations. This dataset highlighted substantial variation in protein expression levels under identical regulatory sequences, with relative expression levels ranging from 2.8- to 176-fold. It also demonstrated that increasing the strength of regulatory sequences does not necessarily lead to significant improvements in the expression levels of target proteins. Using this dataset, we developed various machine learning models and found that integrating promoter regions, ribosome binding sites, and coding sequences significantly improves the accuracy of predicting protein expression levels, giving a Spearman correlation coefficient of 0.72, with the promoter sequence exerting the predominant influence. Our study aims not only to provide a detailed guide for fine-tuning gene expression in the metabolic engineering of Escherichia coli but also to deepen our understanding of the compatibility issues between regulatory sequences and target genes.
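
The abstract reports model accuracy as a Spearman correlation coefficient between predicted and measured expression levels. As a minimal sketch of how such a coefficient can be computed, the Python below ranks both lists (averaging tied ranks) and takes the Pearson correlation of the ranks. The function names and the numbers are assumptions made for illustration; they are not the authors' code or data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values for illustration only; not measurements from the study.
measured = [1.0, 2.8, 12.0, 40.0, 176.0]
predicted = [0.9, 5.0, 3.0, 60.0, 45.0]
rho = spearman(measured, predicted)
print(f"Spearman rho = {rho:.2f}")
```

Because the coefficient depends only on rank order, it rewards a model for ordering constructs correctly even when the predicted magnitudes are off, which suits expression data spanning a wide fold range.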

Source journal: Synthetic and Systems Biotechnology (Biotechnology & Applied Microbiology)
CiteScore: 6.90
Self-citation rate: 12.50%
Articles published: 90
Review time: 67 days
About the journal: Synthetic and Systems Biotechnology aims to promote the communication of original research in synthetic and systems biology, with a strong emphasis on applications in biotechnology. It is a quarterly peer-reviewed journal led by Editor-in-Chief Lixin Zhang. The journal publishes high-quality research focusing on integrative approaches that enable the understanding and design of biological systems, and on research that develops the application of systems and synthetic biology to natural systems. It publishes Articles, Short Notes, Methods, Mini Reviews, Commentaries, and Conference Reviews.