Intelligent Design of Escherichia coli Terminators by Coupling Prediction and Generation Models
Jie Li, Lin-Feng Wu, Kai Liu and Bin-Guang Ma*
ACS Synthetic Biology, vol. 14, no. 9, pp. 3744–3752. Published 2025-09-04. DOI: 10.1021/acssynbio.5c00429
https://pubs.acs.org/doi/10.1021/acssynbio.5c00429
Terminators are specific nucleotide sequences, located at the 3′ end of a gene, that carry transcription termination information. As a fundamental genetic regulatory element, terminators play a crucial role in the design of gene circuits. Accurately characterizing terminator strength is essential for improving the precision of gene circuit designs. Experimental characterization of terminator strength is time-consuming and labor-intensive; therefore, there is a need to develop computational tools capable of accurately predicting terminator strength. Current prediction methods do not fully consider sequence or thermodynamic information related to terminators, and robust models for accurate prediction are lacking. Meanwhile, deep generative models have demonstrated tremendous potential in the design of biological sequences and are expected to be applicable to terminator sequence design. This study focuses on the intelligent design of Escherichia coli terminators and comprises three parts: (1) To construct an intrinsic terminator strength prediction model for E. coli, this study extracts sequence features and thermodynamic features from E. coli intrinsic terminators. Machine learning models based on the selected features achieved a prediction performance of R² = 0.72. (2) This study employs a generative adversarial network (GAN) to learn from intrinsic terminator sequence training data and generate terminator sequences. Evaluation reveals that the generated terminators exhibit data distributions similar to those of intrinsic terminators, demonstrating the reliability of GAN-generated terminator sequences. (3) This study uses the constructed terminator strength prediction model to screen for strong terminators from the generated set. Experimental verification shows that among the 18 selected terminators, 72% exhibit termination efficiencies greater than 90%, confirming the reliability of the intelligent design approach for E. coli terminators.
In sum, this study constructs a terminator strength prediction model and a terminator generation model for E. coli, providing model support for terminator design in gene circuits. This enhances the modularity of biological component design and promotes the development of synthetic biology.
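The predict-then-screen workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the features used, so `gc_content`, `kmer_frequencies`, and `longest_t_run` are hypothetical examples of sequence features, thermodynamic features (which would require an RNA folding tool) are omitted, and `screen_top` accepts any scoring function as a stand-in for a trained strength predictor. `r_squared` computes the standard coefficient of determination, the metric reported as R² = 0.72.

```python
from collections import Counter
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G/C bases in a DNA sequence (hypothetical feature)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def kmer_frequencies(seq: str, k: int = 2) -> Dict[str, float]:
    """Normalized counts of every DNA k-mer over overlapping windows."""
    seq = seq.upper()
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(windows)
    total = len(windows)
    return {
        "".join(kmer): (counts["".join(kmer)] / total if total else 0.0)
        for kmer in product("ACGT", repeat=k)
    }


def longest_t_run(seq: str) -> int:
    """Length of the longest run of consecutive T bases."""
    best = run = 0
    for base in seq.upper():
        run = run + 1 if base == "T" else 0
        best = max(best, run)
    return best


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


def screen_top(
    candidates: Sequence[str],
    predict: Callable[[str], float],
    n: int,
) -> List[Tuple[str, float]]:
    """Rank candidate sequences by predicted strength and keep the top n.

    `predict` stands in for a trained strength model (an assumption here).
    """
    scored = [(seq, predict(seq)) for seq in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]
```

For example, `screen_top(generated_sequences, model_score, 18)` would return the 18 highest-scoring candidates, where `generated_sequences` and `model_score` are placeholder names for the generator output and the fitted predictor.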
About the journal:
The journal is particularly interested in studies on the design and synthesis of new genetic circuits and gene products; computational methods in the design of systems; and integrative applied approaches to understanding disease and metabolism.
Topics may include, but are not limited to:
Design and optimization of genetic systems
Genetic circuit design and the principles for organizing circuits into programs
Computational methods to aid the design of genetic systems
Experimental methods to quantify genetic parts, circuits, and metabolic fluxes
Genetic parts libraries: their creation, analysis, and ontological representation
Protein engineering including computational design
Metabolic engineering and cellular manufacturing, including biomass conversion
Natural product access, engineering, and production
Creative and innovative applications of cellular programming
Medical applications, tissue engineering, and the programming of therapeutic cells
Minimal cell design and construction
Genomics and genome replacement strategies
Viral engineering
Automated and robotic assembly platforms for synthetic biology
DNA synthesis methodologies
Metagenomics and synthetic metagenomic analysis
Bioinformatics applied to gene discovery, chemoinformatics, and pathway construction
Gene optimization
Methods for genome-scale measurements of transcription and metabolomics
Systems biology and methods to integrate multiple data sources
In vitro and cell-free synthetic biology and molecular programming
Nucleic acid engineering.