利用深度学习在给定骨架上设计蛋白质序列

Yufeng Liu, Haiyan Liu
{"title":"利用深度学习在给定骨架上设计蛋白质序列","authors":"Yufeng Liu, Haiyan Liu","doi":"10.1093/protein/gzad024","DOIUrl":null,"url":null,"abstract":"Deep learning methods for protein sequence design focus on modeling and sampling the many- dimensional distribution of amino acid sequences conditioned on the backbone structure. To produce physically foldable sequences, inter-residue couplings need to be considered properly. These couplings are treated explicitly in iterative methods or autoregressive methods. Non-autoregressive models treating these couplings implicitly are computationally more efficient, but still await tests by wet experiment. Currently, sequence design methods are evaluated mainly using native sequence recovery rate and native sequence perplexity. These metrics can be complemented by sequence-structure compatibility metrics obtained from energy calculation or structure prediction. However, existing computational metrics have important limitations that may render the generalization of computational test results to performance in real applications unwarranted. Validation of design methods by wet experiments should be encouraged.","PeriodicalId":20681,"journal":{"name":"Protein Engineering, Design and Selection","volume":"11 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Protein sequence design on given backbones with deep learning\",\"authors\":\"Yufeng Liu, Haiyan Liu\",\"doi\":\"10.1093/protein/gzad024\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning methods for protein sequence design focus on modeling and sampling the many- dimensional distribution of amino acid sequences conditioned on the backbone structure. To produce physically foldable sequences, inter-residue couplings need to be considered properly. These couplings are treated explicitly in iterative methods or autoregressive methods. Non-autoregressive models treating these couplings implicitly are computationally more efficient, but still await tests by wet experiment. Currently, sequence design methods are evaluated mainly using native sequence recovery rate and native sequence perplexity. These metrics can be complemented by sequence-structure compatibility metrics obtained from energy calculation or structure prediction. However, existing computational metrics have important limitations that may render the generalization of computational test results to performance in real applications unwarranted. Validation of design methods by wet experiments should be encouraged.\",\"PeriodicalId\":20681,\"journal\":{\"name\":\"Protein Engineering, Design and Selection\",\"volume\":\"11 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-12-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Protein Engineering, Design and Selection\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/protein/gzad024\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Protein Engineering, Design and Selection","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/protein/gzad024","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

用于蛋白质序列设计的深度学习方法侧重于以骨架结构为条件,对氨基酸序列的多维分布进行建模和采样。为了生成物理上可折叠的序列,需要适当考虑残基间的耦合。这些耦合在迭代法或自回归法中得到了明确的处理。隐含处理这些耦合的非自回归模型计算效率更高,但仍有待湿实验的检验。目前,序列设计方法主要使用原生序列恢复率和原生序列复杂度进行评估。通过能量计算或结构预测获得的序列-结构兼容性指标可以对这些指标进行补充。然而,现有的计算指标有很大的局限性,可能导致无法将计算测试结果推广到实际应用中。应鼓励通过湿实验验证设计方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Protein sequence design on given backbones with deep learning
Deep learning methods for protein sequence design focus on modeling and sampling the many- dimensional distribution of amino acid sequences conditioned on the backbone structure. To produce physically foldable sequences, inter-residue couplings need to be considered properly. These couplings are treated explicitly in iterative methods or autoregressive methods. Non-autoregressive models treating these couplings implicitly are computationally more efficient, but still await tests by wet experiment. Currently, sequence design methods are evaluated mainly using native sequence recovery rate and native sequence perplexity. These metrics can be complemented by sequence-structure compatibility metrics obtained from energy calculation or structure prediction. However, existing computational metrics have important limitations that may render the generalization of computational test results to performance in real applications unwarranted. Validation of design methods by wet experiments should be encouraged.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信