Deep Learning-Assisted Design of Novel Promoters in Escherichia coli
Xinglong Wang, Kangjie Xu, Yameng Tan, Shangyang Yu, Xinyi Zhao, Jingwen Zhou
Advanced Genetics (Hoboken, N.J.), volume 4, issue 4. Journal Article. Published 2023-11-15.
DOI: 10.1002/ggn2.202300184
https://onlinelibrary.wiley.com/doi/10.1002/ggn2.202300184
Abstract
Deep learning (DL) approaches can accurately recognize promoter regions and predict their strength. Here, the potential for controllably designing active Escherichia coli promoters is explored by combining multiple deep learning models. First, “DRSAdesign” is created, which uses a diffusion model to generate different types of novel promoters and then predicts whether each generated sequence is real or fake and how strong it is. Experimental validation showed that 45 out of 50 generated promoters are active and highly diverse, although most have relatively low activity. Next, “Ndesign” is introduced, which generates random sequences carrying the functional −35 and −10 motifs of the sigma70 promoter and predicts their strength with a dedicated DL model. This DL model is trained on 200 generated promoters and validated on 50, giving Pearson correlation coefficients of 0.49 and 0.43, respectively. Using the DL models developed in this work, candidate 6-mers are predicted as key functional motifs of the sigma70 promoter, suggesting that promoter recognition and strength prediction rely mainly on the accommodation of functional motifs. This work provides DL tools to design promoters and assess their functions, paving the way for DL-assisted metabolic engineering.
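
The sequence-generation step described for “Ndesign” can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not state which motif sequences, spacer length, or flank lengths were used, so the textbook sigma70 consensus hexamers (TTGACA for −35, TATAAT for −10), the 17 bp spacer, and the flank lengths below are assumptions, and the function names `random_dna` and `design_promoter` are hypothetical.

```python
import random

# Assumption: textbook sigma70 consensus hexamers. The abstract does not
# say which -35 / -10 motif sequences the authors actually used.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def random_dna(length, rng):
    """Return a uniformly random DNA string of the given length."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def design_promoter(rng, upstream=20, spacer=17, downstream=7):
    """Build one candidate: random flanks and spacer around fixed motifs.

    The flank and spacer lengths are illustrative defaults, not values
    taken from the paper.
    """
    return (
        random_dna(upstream, rng)
        + MINUS_35
        + random_dna(spacer, rng)
        + MINUS_10
        + random_dna(downstream, rng)
    )


# Seeded generator so the candidate set is reproducible.
rng = random.Random(0)
candidates = [design_promoter(rng) for _ in range(5)]
for seq in candidates:
    print(seq)
```

With the default lengths, each candidate is 56 bp long, with the −35 hexamer at positions 20 to 25 and the −10 hexamer at positions 43 to 48 (0-based). In a workflow like the one the abstract describes, such candidates would then be scored by a strength-prediction model before experimental testing.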