Protein language models-assisted optimization of a uracil-N-glycosylase variant enables programmable T-to-G and T-to-C base editing.
Authors: Yan He, Xibin Zhou, Chong Chang, Ge Chen, Weikuan Liu, Geng Li, Xiaoqi Fan, Mingsun Sun, Chensi Miao, Qianyue Huang, Yunqing Ma, Fajie Yuan, Xing Chang
Journal: Molecular Cell, pages 1257-1270.e6 (impact factor 16.6; JCR Q1, Biochemistry & Molecular Biology)
Published: 2024-04-04 (Epub 2024-02-19)
Publication type: Journal Article
DOI: 10.1016/j.molcel.2024.01.021 (https://doi.org/10.1016/j.molcel.2024.01.021)
Citations: 0
Abstract
Current base editors (BEs) use DNA deaminases, including cytidine deaminase in cytidine BE (CBE) or adenine deaminase in adenine BE (ABE), to facilitate transition nucleotide substitutions. Combining CBE or ABE with glycosylase enzymes can induce limited transversion mutations. Nonetheless, a critical demand remains for BEs capable of generating alternative mutation types, such as T>G corrections. In this study, we leveraged pre-trained protein language models to optimize a uracil-N-glycosylase (UNG) variant with altered specificity for thymines (eTDG). Notably, after two rounds of testing fewer than 50 top-ranking variants, more than 50% exhibited over 1.5-fold enhancement in enzymatic activities. When eTDG was fused with nCas9, it induced programmable T-to-S (G/C) substitutions and corrected db/db diabetic mutation in mice (up to 55%). Our findings not only establish orthogonal strategies for developing novel BEs but also demonstrate the capacities of protein language models for optimizing enzymes without extensive task-specific training data.
About the journal:
Molecular Cell is a companion to Cell, the leading journal of biology and the highest-impact journal in the world. Launched in December 1997 and published monthly, Molecular Cell is dedicated to publishing cutting-edge research in molecular biology, focusing on fundamental cellular processes. The journal encompasses a wide range of topics, including DNA replication, recombination, and repair; chromatin biology and genome organization; transcription; RNA processing and decay; non-coding RNA function; translation; protein folding, modification, and quality control; signal transduction pathways; cell cycle and checkpoints; cell death; autophagy; and metabolism.