Yufeng Liu, Sheng Wang, Jixin Dong, Linghui Chen, Xinyu Wang, Lei Wang, Fudong Li, Chenchen Wang, Jiahai Zhang, Yuzhu Wang, Si Wei, Quan Chen, Haiyan Liu
{"title":"De novo protein design with a denoising diffusion network independent of pretrained structure prediction models.","authors":"Yufeng Liu, Sheng Wang, Jixin Dong, Linghui Chen, Xinyu Wang, Lei Wang, Fudong Li, Chenchen Wang, Jiahai Zhang, Yuzhu Wang, Si Wei, Quan Chen, Haiyan Liu","doi":"10.1038/s41592-024-02437-w","DOIUrl":null,"url":null,"abstract":"<p><p>The recent success of RFdiffusion, a method for protein structure design with a denoising diffusion probabilistic model, has relied on fine-tuning the RoseTTAFold structure prediction network for protein backbone denoising. Here, we introduce SCUBA-diffusion (SCUBA-D), a protein backbone denoising diffusion probabilistic model freshly trained by considering co-diffusion of sequence representation to enhance model regularization and adversarial losses to minimize data-out-of-distribution errors. While matching the performance of the pretrained RoseTTAFold-based RFdiffusion in generating experimentally realizable protein structures, SCUBA-D readily generates protein structures with not-yet-observed overall folds that are different from those predictable with RoseTTAFold. The accuracy of SCUBA-D was confirmed by the X-ray structures of 16 designed proteins and a protein complex, and by experiments validating designed heme-binding proteins and Ras-binding proteins. Our work shows that deep generative models of images or texts can be fruitfully extended to complex physical objects like protein structures by addressing outstanding issues such as the data-out-of-distribution errors.</p>","PeriodicalId":18981,"journal":{"name":"Nature Methods","volume":null,"pages":null},"PeriodicalIF":36.1000,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature Methods","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1038/s41592-024-02437-w","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
The recent success of RFdiffusion, a method for protein structure design with a denoising diffusion probabilistic model, has relied on fine-tuning the RoseTTAFold structure prediction network for protein backbone denoising. Here, we introduce SCUBA-diffusion (SCUBA-D), a protein backbone denoising diffusion probabilistic model freshly trained by considering co-diffusion of sequence representation to enhance model regularization and adversarial losses to minimize data-out-of-distribution errors. While matching the performance of the pretrained RoseTTAFold-based RFdiffusion in generating experimentally realizable protein structures, SCUBA-D readily generates protein structures with not-yet-observed overall folds that are different from those predictable with RoseTTAFold. The accuracy of SCUBA-D was confirmed by the X-ray structures of 16 designed proteins and a protein complex, and by experiments validating designed heme-binding proteins and Ras-binding proteins. Our work shows that deep generative models of images or texts can be fruitfully extended to complex physical objects like protein structures by addressing outstanding issues such as the data-out-of-distribution errors.
期刊介绍:
Nature Methods is a monthly journal that focuses on publishing innovative methods and substantial enhancements to fundamental life sciences research techniques. Geared towards a diverse, interdisciplinary readership of researchers in academia and industry engaged in laboratory work, the journal offers new tools for research and emphasizes the immediate practical significance of the featured work. It publishes primary research papers and reviews recent technical and methodological advancements, with a particular interest in primary methods papers relevant to the biological and biomedical sciences. This includes methods rooted in chemistry with practical applications for studying biological problems.