Christopher Yin, Sebastian Castillo-Hair, Gun Woo Byeon, Peter Bromley, Wouter Meuleman, Georg Seelig
Cell Systems, 101302. Published 2025-06-04. doi: 10.1016/j.cels.2025.101302 (https://doi.org/10.1016/j.cels.2025.101302)
Iterative deep learning design of human enhancers exploits condensed sequence grammar to achieve cell-type specificity.
An important and largely unsolved problem in synthetic biology is how to target gene expression to specific cell types. Here, we apply iterative deep learning to design synthetic enhancers with strong differential activity between two human cell lines. We initially train models on published datasets of enhancer activity and chromatin accessibility and use them to guide the design of synthetic enhancers that maximize predicted specificity. We experimentally validate these sequences, use the measurements to re-optimize the model, and design a second generation of enhancers with improved specificity. Our design methods embed relevant transcription factor binding site (TFBS) motifs with higher frequency than comparable endogenous enhancers while using a more selective motif vocabulary, and we show that enhancer activity is correlated with transcription factor expression at the single-cell level. Finally, we characterize causal features of top enhancers via perturbation experiments and show that enhancers as short as 50 bp can maintain specificity. A record of this paper's transparent peer review process is included in the supplemental information.
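The design step summarized above, searching for sequences that maximize a predicted specificity score, can be illustrated with a toy sketch. This is not the authors' model or design method: the motif lists, the `toy_specificity` scoring function, and the greedy search in `greedy_design` are all hypothetical stand-ins chosen for illustration, and a simple motif counter takes the place of a trained deep learning predictor. The 50-base length only echoes the shortest enhancers mentioned in the abstract.

```python
import random

BASES = "ACGT"

# Hypothetical motifs, chosen only for illustration; not taken from the paper.
MOTIFS_A = ["GATAA", "TGACTCA"]  # stand-ins for motifs favoring cell line A
MOTIFS_B = ["TGTTTAC", "CCAAT"]  # stand-ins for motifs favoring cell line B


def count_occurrences(seq, motif):
    """Count (possibly overlapping) occurrences of motif in seq."""
    return sum(
        1
        for i in range(len(seq) - len(motif) + 1)
        if seq[i:i + len(motif)] == motif
    )


def toy_specificity(seq):
    """Toy surrogate for a learned predictor: motif hits for A minus hits for B."""
    hits_a = sum(count_occurrences(seq, m) for m in MOTIFS_A)
    hits_b = sum(count_occurrences(seq, m) for m in MOTIFS_B)
    return hits_a - hits_b


def greedy_design(length=50, rounds=2000, seed=0):
    """Hill-climb over single-base substitutions, keeping any change
    that does not lower the toy specificity score."""
    rng = random.Random(seed)
    seq = [rng.choice(BASES) for _ in range(length)]
    best = toy_specificity("".join(seq))
    for _ in range(rounds):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(BASES)
        score = toy_specificity("".join(seq))
        if score >= best:
            best = score      # accept improving or neutral substitutions
        else:
            seq[pos] = old    # revert substitutions that lower the score
    return "".join(seq), best


if __name__ == "__main__":
    designed, score = greedy_design()
    print(designed, score)
```

In an iterative workflow of the kind the abstract describes, the scoring function would be replaced by a trained model, and measurements of the designed sequences would be used to update that model before the next design round. The sketch covers only a single in-silico design pass.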