Multistate and functional protein design using RoseTTAFold sequence space diffusion

IF 33.1 1区 生物学 Q1 BIOTECHNOLOGY & APPLIED MICROBIOLOGY
Sidney Lyayuga Lisanza, Jacob Merle Gershon, Samuel W. K. Tipps, Jeremiah Nelson Sims, Lucas Arnoldt, Samuel J. Hendel, Miriam K. Simma, Ge Liu, Muna Yase, Hongwei Wu, Claire D. Tharp, Xinting Li, Alex Kang, Evans Brackenbrough, Asim K. Bera, Stacey Gerben, Bruce J. Wittmann, Andrew C. McShan, David Baker
{"title":"Multistate and functional protein design using RoseTTAFold sequence space diffusion","authors":"Sidney Lyayuga Lisanza, Jacob Merle Gershon, Samuel W. K. Tipps, Jeremiah Nelson Sims, Lucas Arnoldt, Samuel J. Hendel, Miriam K. Simma, Ge Liu, Muna Yase, Hongwei Wu, Claire D. Tharp, Xinting Li, Alex Kang, Evans Brackenbrough, Asim K. Bera, Stacey Gerben, Bruce J. Wittmann, Andrew C. McShan, David Baker","doi":"10.1038/s41587-024-02395-w","DOIUrl":null,"url":null,"abstract":"<p>Protein denoising diffusion probabilistic models are used for the de novo generation of protein backbones but are limited in their ability to guide generation of proteins with sequence-specific attributes and functional properties. To overcome this limitation, we developed ProteinGenerator (PG), a sequence space diffusion model based on RoseTTAFold that simultaneously generates protein sequences and structures. Beginning from a noised sequence representation, PG generates sequence and structure pairs by iterative denoising, guided by desired sequence and structural protein attributes. We designed thermostable proteins with varying amino acid compositions and internal sequence repeats and cage bioactive peptides, such as melittin. By averaging sequence logits between diffusion trajectories with distinct structural constraints, we designed multistate parent–child protein triples in which the same sequence folds to different supersecondary structures when intact in the parent versus split into two child domains. PG design trajectories can be guided by experimental sequence–activity data, providing a general approach for integrated computational and experimental optimization of protein function.</p>","PeriodicalId":19084,"journal":{"name":"Nature biotechnology","volume":null,"pages":null},"PeriodicalIF":33.1000,"publicationDate":"2024-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature biotechnology","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1038/s41587-024-02395-w","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOTECHNOLOGY & APPLIED MICROBIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Protein denoising diffusion probabilistic models are used for the de novo generation of protein backbones but are limited in their ability to guide generation of proteins with sequence-specific attributes and functional properties. To overcome this limitation, we developed ProteinGenerator (PG), a sequence space diffusion model based on RoseTTAFold that simultaneously generates protein sequences and structures. Beginning from a noised sequence representation, PG generates sequence and structure pairs by iterative denoising, guided by desired sequence and structural protein attributes. We designed thermostable proteins with varying amino acid compositions and internal sequence repeats and cage bioactive peptides, such as melittin. By averaging sequence logits between diffusion trajectories with distinct structural constraints, we designed multistate parent–child protein triples in which the same sequence folds to different supersecondary structures when intact in the parent versus split into two child domains. PG design trajectories can be guided by experimental sequence–activity data, providing a general approach for integrated computational and experimental optimization of protein function.

Abstract Image

利用 RoseTTAFold 序列空间扩散进行多态和功能蛋白质设计
蛋白质去噪扩散概率模型可用于从头生成蛋白质骨架,但在指导生成具有特定序列属性和功能特性的蛋白质方面能力有限。为了克服这一局限,我们开发了 ProteinGenerator (PG),这是一种基于 RoseTTAFold 的序列空间扩散模型,可同时生成蛋白质序列和结构。PG 从噪声序列表示开始,在所需序列和结构蛋白质属性的指导下,通过迭代去噪生成序列和结构对。我们设计了具有不同氨基酸组成和内部序列重复的恒温蛋白质和笼状生物活性肽,如 Melittin。通过平均具有不同结构约束的扩散轨迹之间的序列对数,我们设计出了多态父子蛋白质三元组,其中相同的序列在完整的父域和分裂成两个子域时会折叠成不同的超二级结构。PG 设计轨迹可以由实验序列-活性数据指导,为蛋白质功能的综合计算和实验优化提供了一种通用方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Nature biotechnology
Nature biotechnology 工程技术-生物工程与应用微生物
CiteScore
63.00
自引率
1.70%
发文量
382
审稿时长
3 months
期刊介绍: Nature Biotechnology is a monthly journal that focuses on the science and business of biotechnology. It covers a wide range of topics including technology/methodology advancements in the biological, biomedical, agricultural, and environmental sciences. The journal also explores the commercial, political, ethical, legal, and societal aspects of this research. The journal serves researchers by providing peer-reviewed research papers in the field of biotechnology. It also serves the business community by delivering news about research developments. This approach ensures that both the scientific and business communities are well-informed and able to stay up-to-date on the latest advancements and opportunities in the field. Some key areas of interest in which the journal actively seeks research papers include molecular engineering of nucleic acids and proteins, molecular therapy, large-scale biology, computational biology, regenerative medicine, imaging technology, analytical biotechnology, applied immunology, food and agricultural biotechnology, and environmental biotechnology. In summary, Nature Biotechnology is a comprehensive journal that covers both the scientific and business aspects of biotechnology. It strives to provide researchers with valuable research papers and news while also delivering important scientific advancements to the business community.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信