Learning the local landscape of protein structures with convolutional neural networks

Impact factor 1.8; Zone 4 (Biology); JCR Q3 (Biophysics)
Anastasiya V. Kulikova, Daniel J. Diaz, James M. Loy, Andrew D. Ellington, Claus O. Wilke
Journal of Biological Physics, published 2021-11-09
DOI: 10.1007/s10867-021-09593-6
Article: https://link.springer.com/article/10.1007/s10867-021-09593-6
PDF: https://link.springer.com/content/pdf/10.1007/s10867-021-09593-6.pdf
Citations: 5

Abstract


One fundamental problem of protein biochemistry is to predict protein structure from amino acid sequence. The inverse problem, predicting either entire sequences or individual mutations that are consistent with a given protein structure, has received much less attention even though it has important applications in both protein engineering and evolutionary biology. Here, we ask whether 3D convolutional neural networks (3D CNNs) can learn the local fitness landscape of protein structure to reliably predict either the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding a site of interest. We find that the network can predict the wild type with good accuracy, and that network confidence is a reliable measure of whether a given prediction is likely to be correct. Predictions of the consensus are less accurate and are primarily driven by whether or not the consensus matches the wild type. Our work suggests that high-confidence mis-predictions of the wild type may identify sites that are primed for mutation and are likely targets for protein engineering.
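
The idea of using network confidence to pick out candidate sites can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the network emits one raw score per amino acid for each site, that confidence is taken as the highest softmax probability, and that a cutoff of 0.9 counts as "high confidence". The function names, the class order in `AMINO_ACIDS`, and the threshold are all hypothetical.

```python
import math

# The 20 standard amino acids, one-letter codes (class order is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(logits):
    """Return (predicted amino acid, confidence) for one site."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return AMINO_ACIDS[best], probs[best]


def flag_candidate_sites(site_logits, wild_types, threshold=0.9):
    """List sites where a confident prediction disagrees with the wild type.

    `threshold` is a hypothetical cutoff chosen for illustration only.
    """
    flagged = []
    for index, (logits, wild_type) in enumerate(zip(site_logits, wild_types)):
        predicted, confidence = predict_with_confidence(logits)
        if predicted != wild_type and confidence >= threshold:
            flagged.append((index, wild_type, predicted, confidence))
    return flagged
```

Calling `flag_candidate_sites(site_logits, wild_types)` with one score vector per site and the matching wild-type residues returns tuples of site index, wild-type residue, predicted residue, and confidence. Under the assumptions above, these are the high-confidence mis-predictions that the abstract suggests as possible engineering targets.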

Source journal
Journal of Biological Physics (Biology: Biophysics)
CiteScore: 3.00
Self-citation rate: 5.60%
Articles published: 20
Review time: >12 weeks
About the journal: Many physicists are turning their attention to domains that were not traditionally part of physics and are applying the sophisticated tools of theoretical, computational and experimental physics to investigate biological processes, systems and materials. The Journal of Biological Physics provides a medium where this growing community of scientists can publish its results and discuss its aims and methods. It welcomes papers which use the tools of physics in an innovative way to study biological problems, as well as research aimed at providing a better understanding of the physical principles underlying biological processes.