Protein sequence design on given backbones with deep learning

Protein Engineering, Design and Selection. Pub Date: 2023-12-29. DOI: 10.1093/protein/gzad024

Yufeng Liu, Haiyan Liu

Citations: 0

Abstract

Deep learning methods for protein sequence design focus on modeling and sampling the many-dimensional distribution of amino acid sequences conditioned on the backbone structure. To produce physically foldable sequences, inter-residue couplings need to be considered properly. These couplings are treated explicitly in iterative or autoregressive methods. Non-autoregressive models, which treat these couplings implicitly, are computationally more efficient but still await testing by wet experiments. Currently, sequence design methods are evaluated mainly by native sequence recovery rate and native sequence perplexity. These metrics can be complemented by sequence-structure compatibility metrics obtained from energy calculation or structure prediction. However, existing computational metrics have important limitations, so generalizing computational test results to performance in real applications may be unwarranted. Validation of design methods by wet experiments should be encouraged.
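
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a design model that outputs a probability for each of the 20 amino acids at every position; the function names, the toy sequences, and the uniform "model" are hypothetical and do not come from the paper. Recovery rate is taken here as the fraction of positions where the designed residue matches the native one, and perplexity as the exponential of the mean negative log-probability assigned to the native residues.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def recovery_rate(designed: str, native: str) -> float:
    """Fraction of positions where the designed residue equals the native one."""
    if len(designed) != len(native):
        raise ValueError("sequences must have the same length")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


def native_perplexity(probs, native: str) -> float:
    """exp(mean negative log-probability of the native residues).

    probs[i] is a dict mapping each amino acid to the probability the
    (hypothetical) model assigns to it at position i.
    """
    nll = -sum(math.log(p[aa]) for p, aa in zip(probs, native))
    return math.exp(nll / len(native))


# Toy data, invented for illustration only.
native = "ACDE"
designed = "ACDF"
# A model with no information: uniform probabilities at every position.
uniform = [{aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS} for _ in native]

print(recovery_rate(designed, native))     # 3 of 4 positions match
print(native_perplexity(uniform, native))  # close to 20 for a uniform model
```

A uniform model gives a perplexity of about 20, the number of residue types, so lower values indicate that the model concentrates probability on the native residues. As the abstract cautions, neither number by itself shows that a designed sequence will fold in a wet experiment.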
