What is the probability of a chance prediction of a protein structure with an rmsd of 6 Å?

Boris A Reva, Alexei V Finkelstein, Jeffrey Skolnick
Folding & Design, volume 3, issue 2, pages 141-147. Published 1998-04-01. DOI: 10.1016/S1359-0278(98)00019-4
https://www.sciencedirect.com/science/article/pii/S1359027898000194
Citations: 179

Abstract

Background: The root mean square deviation (rmsd) between corresponding atoms of two protein chains is a commonly used measure of similarity between two protein structures. The smaller the rmsd between two structures, the more similar they are. In protein structure prediction, one needs to know the rmsd between predicted and experimental structures below which a prediction can be considered successful. Success is obvious only when the rmsd is as small as that for closely homologous proteins (< 3 Å). To estimate the quality of a prediction in the more general case, one has to compare the native structure not only with the predicted one but also with randomly chosen protein-like folds. One can ask: how many such structures must be considered to find a structure with a given rmsd from the native structure?
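
For readers unfamiliar with the measure, the following is a minimal sketch of an rmsd calculation over corresponding atoms. It assumes the two coordinate sets are already superimposed; it does not search for an optimal rotation or translation, and the abstract does not say which superposition procedure the authors used. The function name `rmsd` and the toy coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z) points.

    Assumption: the structures are already superimposed. No optimal
    rotation or translation is computed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for a, b in zip(coords_a, coords_b):
        # Squared distance between one pair of corresponding atoms.
        total += sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.sqrt(total / len(coords_a))


# Toy example (made-up coordinates, in Å): the model is the native
# structure shifted by 3 Å along x, so every atom deviates by 3 Å.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(x + 3.0, y, z) for x, y, z in native]
print(rmsd(native, model))
```

Because every atom in the toy model is displaced by the same 3 Å, the rmsd comes out at about 3 Å, the threshold the abstract associates with closely homologous proteins.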

Results: We calculated the rmsd values between native structures of 142 proteins and all compact structures obtained in the threading of these protein chains over 364 non-homologous structures. The rmsd distributions have a Gaussian form, with the average rmsd approximately proportional to the radius of gyration.
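
The radius of gyration mentioned above can be sketched as follows. This version assumes equal weights for all atoms, which is a simplification chosen for illustration; the abstract does not state how the authors computed it, nor the proportionality constant relating it to the average rmsd. The function name and coordinates are hypothetical.

```python
import math


def radius_of_gyration(coords):
    """Radius of gyration of a set of (x, y, z) points, all weighted equally.

    Assumption: equal weights per atom (for example, one C-alpha per residue).
    """
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one point")
    # Geometric center of the point set.
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    # Mean squared distance from the center.
    mean_sq = sum(
        sum((p[k] - center[k]) ** 2 for k in range(3)) for p in coords
    ) / n
    return math.sqrt(mean_sq)


# Toy example (made-up coordinates): four points on a 2 x 2 square.
square = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]
print(radius_of_gyration(square))
```

Each corner of the square lies at distance sqrt(2) from the center, so the function returns a value close to 1.414 for this input.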

Conclusions: We estimated the number of protein-like structures required to obtain a structure within an rmsd of 6 Å to be 10^4–10^5 for chains of 60–80 residues and 10^11–10^12 for chains of 160–200 residues. The probability of obtaining a 6 Å rmsd by chance is so remote that when such a structure is obtained from a prediction algorithm, the prediction should be considered quite successful.
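
One way to see how a Gaussian rmsd distribution leads to a count of required structures is the sketch below. It assumes the chance probability is the lower tail of a normal distribution and that the expected number of random structures needed is roughly the reciprocal of that probability. The mean and standard deviation used here are hypothetical placeholders, not values from the paper, and the paper's actual estimation procedure may differ.

```python
import math


def chance_probability(threshold, mean, sd):
    """P(rmsd <= threshold) for a Gaussian rmsd distribution.

    Uses erfc so that small tail probabilities keep their precision.
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    return 0.5 * math.erfc((mean - threshold) / (sd * math.sqrt(2.0)))


def structures_needed(threshold, mean, sd):
    """Rough count of random structures needed to expect one hit: 1 / P."""
    return 1.0 / chance_probability(threshold, mean, sd)


# HYPOTHETICAL parameters in Å, chosen only to illustrate the arithmetic.
# They are not taken from the paper.
mean_rmsd = 12.0
sd_rmsd = 1.5
p = chance_probability(6.0, mean_rmsd, sd_rmsd)
print(f"P(rmsd <= 6 Å) = {p:.3e}")
print(f"structures needed = {structures_needed(6.0, mean_rmsd, sd_rmsd):.3e}")
```

With these placeholder values the 6 Å threshold sits four standard deviations below the mean, so the probability is on the order of 10^-5. Shifting the mean upward, as would happen for longer chains with a larger radius of gyration, drives the required number of structures up very quickly.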
