Protein sequence design on given backbones with deep learning

Protein Engineering, Design and Selection. Pub Date: 2023-12-29. DOI: 10.1093/protein/gzad024

Yufeng Liu, Haiyan Liu


Abstract

Deep learning methods for protein sequence design focus on modeling and sampling the many-dimensional distribution of amino acid sequences conditioned on the backbone structure. To produce physically foldable sequences, inter-residue couplings need to be considered properly. These couplings are treated explicitly in iterative or autoregressive methods. Non-autoregressive models that treat these couplings implicitly are computationally more efficient, but still await testing by wet experiments. Currently, sequence design methods are evaluated mainly using native sequence recovery rate and native sequence perplexity. These metrics can be complemented by sequence-structure compatibility metrics obtained from energy calculation or structure prediction. However, existing computational metrics have important limitations that may render the generalization of computational test results to performance in real applications unwarranted. Validation of design methods by wet experiments should be encouraged.
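The abstract names native sequence recovery rate and native sequence perplexity without defining them. The sketch below uses the commonly used definitions, which are an assumption here and not taken from the paper: recovery is the fraction of positions where the designed residue equals the native one, and perplexity is the exponential of the mean negative log-probability that the model assigns to the native residues. The function names and the list-of-dicts format for per-position probabilities are hypothetical and chosen only for illustration.

```python
import math


def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of positions where the designed residue matches the native one."""
    if len(designed) != len(native):
        raise ValueError("sequences must have equal length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


def native_perplexity(probs: list, native: str) -> float:
    """exp(mean negative log-probability of the native residues).

    probs[i] is a dict mapping one-letter amino acid codes to the
    probability the model assigns at position i (assumed format).
    A probability of exactly zero for a native residue raises ValueError
    from math.log, so real model outputs may need clipping first.
    """
    if len(probs) != len(native):
        raise ValueError("need one probability table per position")
    if not native:
        raise ValueError("sequence must be non-empty")
    mean_nll = -sum(math.log(p[aa]) for p, aa in zip(probs, native)) / len(native)
    return math.exp(mean_nll)


if __name__ == "__main__":
    amino_acids = "ACDEFGHIKLMNPQRSTVWY"
    native = "ACDF"
    designed = "ACDE"
    # A uniform model carries no information about the backbone.
    uniform = [{aa: 1.0 / len(amino_acids) for aa in amino_acids} for _ in native]
    print(sequence_recovery(designed, native))
    print(native_perplexity(uniform, native))
```

A model that assigns uniform probability over the 20 amino acids gives a perplexity of 20 under this definition, so lower values indicate that the model concentrates probability on the native residues. Neither number says whether a designed sequence actually folds, which is the limitation the abstract raises.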


