Benchmarking predictive methods for small-angle X-ray scattering from atomic coordinates of proteins using maximum likelihood consensus data
IUCrJ, volume 11, issue 5, pages 762-779. Published 2024-09-01. DOI: 10.1107/S205225252400486X
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11364021/pdf/
Citations: 0
Abstract
Consensus small-angle X-ray scattering (SAXS) data from five proteins in solution, generated from 171 independent measurements on 12 beamlines using a maximum likelihood method, are used to benchmark computational methods for predicting SAXS profiles from atomic coordinates. The results reveal important strengths and limitations of different methods that are serving a growing community of users in applications ranging from fundamental integrative structural biology to drug discovery and development.
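The abstract does not spell out the maximum likelihood protocol, so the following is only a generic illustration of the idea of a consensus profile, not the authors' method. It assumes that all measurements share the same q grid, have already been placed on a common intensity scale, and carry independent Gaussian errors. Under those assumptions the maximum likelihood estimate of the shared intensity at each q point is the inverse-variance weighted mean. The function name `consensus_profile` and the toy numbers are hypothetical.

```python
import math


def consensus_profile(intensities, sigmas):
    """Inverse-variance weighted mean at each q point.

    intensities, sigmas: one list per measurement, all on the same q grid
    and on a common scale (both are assumptions of this sketch).
    Returns (consensus intensity, consensus standard error) per q point.
    """
    n_q = len(intensities[0])
    mean, err = [], []
    for k in range(n_q):
        # Weight each measurement by 1 / sigma^2 at this q point.
        weights = [1.0 / s[k] ** 2 for s in sigmas]
        total = sum(weights)
        mean.append(sum(w * i[k] for w, i in zip(weights, intensities)) / total)
        # Standard error of the weighted mean for independent Gaussian errors.
        err.append(math.sqrt(1.0 / total))
    return mean, err


# Hypothetical toy data: three measurements of a two-point profile.
curves = [[10.0, 4.0], [12.0, 4.0], [11.0, 4.0]]
errors = [[1.0, 0.5], [1.0, 0.5], [1.0, 0.5]]
consensus, consensus_err = consensus_profile(curves, errors)
```

Measurements with smaller reported errors pull the consensus toward their values, which is why consistent error estimates across beamlines matter for any scheme of this kind.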
Stimulated by informal conversations at the XVII International Small Angle Scattering (SAS) conference (Traverse City, 2017), an international team of experts undertook a round-robin exercise to produce a large dataset from proteins under standard solution conditions. These data were used to generate consensus SAS profiles for xylose isomerase, urate oxidase, xylanase, lysozyme and ribonuclease A. Here, we apply a new protocol using maximum likelihood with a larger number of the contributed datasets to generate improved consensus profiles. We investigate the fits of these profiles to predicted profiles from atomic coordinates that incorporate different models to account for the contribution to the scattering of water molecules of hydration surrounding proteins in solution. Programs using an implicit, shell-type hydration layer generally optimize fits to experimental data with the aid of two parameters that adjust the volume of the bulk solvent excluded by the protein and the contrast of the hydration layer. For these models, we found the error-weighted residual differences between the model and the experiment generally reflected the subsidiary maxima and minima in the consensus profiles that are determined by the size of the protein plus the hydration layer. By comparison, all-atom solute and solvent molecular dynamics (MD) simulations are without the benefit of adjustable parameters and, nonetheless, they yielded at least equally good fits with residual differences that are less reflective of the structure in the consensus profile. Further, where MD simulations accounted for the precise solvent composition of the experiment, specifically the inclusion of ions, the modelled radius of gyration values were significantly closer to the experiment. 
The power of adjustable parameters to mask real differences between a model and the structure present in solution is demonstrated by the results for the conformationally dynamic ribonuclease A and calculations with pseudo-experimental data. This study shows that, while methods invoking an implicit hydration layer have the unequivocal advantage of speed, care is needed to understand the influence of the adjustable parameters. All-atom solute and solvent MD simulations are slower but are less susceptible to false positives, can account for thermal fluctuations in atomic positions, and more accurately represent the water molecules of hydration that contribute to the scattering profile.
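The error-weighted residual differences mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any program benchmarked in the paper: the only fitted parameter here is a linear scale factor, whereas implicit-hydration programs also adjust the excluded solvent volume and the hydration-layer contrast. The function name `weighted_residuals` and the variable names are hypothetical.

```python
def weighted_residuals(i_exp, sigma, i_model):
    """Return (scale, residuals, reduced chi-square) for one model curve.

    i_exp, sigma: experimental intensities and their standard errors.
    i_model: predicted intensities on the same q grid (an assumption).
    """
    # Least-squares scale factor that best maps the model onto the data.
    num = sum(e * m / s ** 2 for e, m, s in zip(i_exp, i_model, sigma))
    den = sum(m * m / s ** 2 for m, s in zip(i_model, sigma))
    scale = num / den
    # Error-weighted residual at each q point.
    residuals = [(e - scale * m) / s for e, m, s in zip(i_exp, i_model, sigma)]
    dof = len(i_exp) - 1  # one fitted parameter: the scale factor
    chi2_red = sum(r * r for r in residuals) / dof
    return scale, residuals, chi2_red


# Hypothetical toy data: the model differs from the data only by a scale.
scale, residuals, chi2_red = weighted_residuals(
    [2.0, 1.0, 0.5], [0.1, 0.1, 0.1], [1.0, 0.5, 0.25]
)
```

Plotting the residuals against q, rather than quoting the reduced chi-square alone, shows whether misfit is random or tracks the maxima and minima of the profile, which is the distinction the abstract draws between the implicit-hydration and MD-based fits.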
About the journal:
IUCrJ is a new fully open-access peer-reviewed journal from the International Union of Crystallography (IUCr).
The journal will publish high-profile articles on all aspects of the sciences and technologies supported by the IUCr via its commissions, including emerging fields where structural results underpin the science reported in the article. Our aim is to make IUCrJ the natural home for high-quality structural science results. Chemists, biologists, physicists and material scientists will be actively encouraged to report their structural studies in IUCrJ.