De novo protein design with a denoising diffusion network independent of pretrained structure prediction models.

IF 36.1 | Zone 1, Biology | JCR Q1, Biochemical Research Methods
Yufeng Liu, Sheng Wang, Jixin Dong, Linghui Chen, Xinyu Wang, Lei Wang, Fudong Li, Chenchen Wang, Jiahai Zhang, Yuzhu Wang, Si Wei, Quan Chen, Haiyan Liu
Journal: Nature Methods
Published: 2024-10-09
DOI: 10.1038/s41592-024-02437-w (https://doi.org/10.1038/s41592-024-02437-w)
Citations: 0

Abstract

The recent success of RFdiffusion, a method for protein structure design with a denoising diffusion probabilistic model, has relied on fine-tuning the RoseTTAFold structure prediction network for protein backbone denoising. Here, we introduce SCUBA-diffusion (SCUBA-D), a protein backbone denoising diffusion probabilistic model freshly trained by considering co-diffusion of sequence representation to enhance model regularization and adversarial losses to minimize data-out-of-distribution errors. While matching the performance of the pretrained RoseTTAFold-based RFdiffusion in generating experimentally realizable protein structures, SCUBA-D readily generates protein structures with not-yet-observed overall folds that are different from those predictable with RoseTTAFold. The accuracy of SCUBA-D was confirmed by the X-ray structures of 16 designed proteins and a protein complex, and by experiments validating designed heme-binding proteins and Ras-binding proteins. Our work shows that deep generative models of images or texts can be fruitfully extended to complex physical objects like protein structures by addressing outstanding issues such as the data-out-of-distribution errors.

Source journal: Nature Methods
Category: Biology, Biochemical Research Methods
CiteScore: 58.70
Self-citation rate: 1.70%
Articles published: 326
Review time: 1 month
About the journal: Nature Methods is a monthly journal that focuses on publishing innovative methods and substantial enhancements to fundamental life sciences research techniques. Geared towards a diverse, interdisciplinary readership of researchers in academia and industry engaged in laboratory work, the journal offers new tools for research and emphasizes the immediate practical significance of the featured work. It publishes primary research papers and reviews recent technical and methodological advancements, with a particular interest in primary methods papers relevant to the biological and biomedical sciences. This includes methods rooted in chemistry with practical applications for studying biological problems.