SAXS-A-FOLD: a website for fast ensemble modeling optimizing the fit of AlphaFold or user-supplied protein structures with flexible regions to SAXS data.

IF 6.1 · Region 3 (Materials Science) · Q1 (Biochemistry, Genetics and Molecular Biology)
Journal of Applied Crystallography Pub Date : 2025-05-29 eCollection Date: 2025-06-01 DOI:10.1107/S1600576725003590
Emre Brookes, Joseph E Curtis, Aaron Householder, Mattia Rocco
Journal of Applied Crystallography, volume 58, Pt 3, pages 1034-1049. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12135990/pdf/

Abstract

AI programs such as AlphaFold (AF) are having a major impact on structural biology. However, predicted unstructured regions, the arrangement of linker-connected domains and their conformational changes in response to environmental variables present challenges that are not easily dealt with on purely computational grounds. An approach that uses predicted (or solved) protein modules/domains linked by potentially unstructured regions and that generates ensembles of models optimized against small-angle X-ray scattering (SAXS) data has been recently described [Brookes et al. (2023). J. Appl. Cryst. 56, 910-926]. Its implementation on a public-domain website, SAXS-A-FOLD (https://saxsafold.genapp.rocks), is presented here. User-supplied SAXS experimental intensity I(q) versus scattering vector magnitude q and the derived pair-wise distance distribution function P(r) versus r are first uploaded. An AF or user-supplied structure (currently only single chains without prosthetic groups) is then uploaded and displayed, and its SAXS I(q) and P(r) profiles are computed and compared with the experimental data. If uploaded from AF, the structure is color-coded by the associated confidence level: on this basis, the website automatically proposes potential flexible regions that can be user modified. For user-supplied structures, these regions have to be directly entered. A starting pool of typically 10-50 × 10³ conformations is generated using a Monte Carlo method that samples backbone dihedral angles along the chosen segments of potential flexibility in the protein structures. The initial pool is reduced to obtain a tractable set of models, for which P(r) and I(q) are computed with fast established methods. A global fit is performed using non-negatively constrained least-squares (NNLS) versus original data. The P(r) and I(q) NNLS results are then displayed, showing both the reconstructed curves and the contributing model curves, with their percentage contributions.
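
The NNLS fitting step described above can be illustrated with a minimal sketch. This is not the website's implementation: the function name `nnls_ensemble_fit`, the optional error weighting and the synthetic Guinier-like model curves are assumptions made only for illustration. The only library routine used is `scipy.optimize.nnls`, which solves min ||Ax - b|| subject to x >= 0.

```python
import numpy as np
from scipy.optimize import nnls


def nnls_ensemble_fit(model_curves, target, sigma=None):
    """Fit a target curve as a non-negative combination of model curves.

    model_curves: array of shape (n_models, n_points), one curve per model.
    target:       array of shape (n_points,), e.g. experimental I(q) or P(r).
    sigma:        optional per-point errors used to weight the residuals
                  (an assumption of this sketch).
    Returns (coefficients, percentage contributions, residual norm).
    """
    a = np.asarray(model_curves, dtype=float).T  # columns are models
    b = np.asarray(target, dtype=float)
    if sigma is not None:
        w = 1.0 / np.asarray(sigma, dtype=float)
        a = a * w[:, None]
        b = b * w
    coeffs, residual_norm = nnls(a, b)
    total = coeffs.sum()
    percent = 100.0 * coeffs / total if total > 0 else np.zeros_like(coeffs)
    return coeffs, percent, residual_norm


# Synthetic stand-ins for model I(q) curves: Guinier-like decays with
# hypothetical radii of gyration (in the same length units as 1/q).
q = np.linspace(0.01, 0.3, 60)
rg_values = [15.0, 20.0, 30.0]
models = np.array([np.exp(-((q * rg) ** 2) / 3.0) for rg in rg_values])

# Noise-free "data" built from the first and third model only.
target = 0.7 * models[0] + 0.3 * models[2]

coeffs, percent, rnorm = nnls_ensemble_fit(models, target)
for rg, c, p in zip(rg_values, coeffs, percent):
    print(f"Rg = {rg:5.1f}  coefficient = {c:.4f}  contribution = {p:.1f}%")
```

Because the coefficients are constrained to be non-negative, models that do not help reproduce the data receive a zero weight, which is what allows the contributing models to be reported with percentage contributions.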
A WAXSiS (https://waxsis.uni-saarland.de) implementation is utilized to calculate an I(q) for each selected model. These sets can be enhanced by adding a user-defined number of models generated before and after each selected model in the original Monte Carlo pool, ensuring the inclusion of nearby models that might better fit the data. Finally, NNLS is used on the WAXSiS-generated I(q) set versus the original I(q) data, with the results displaying the contributing models and their I(q). Aside from being representative of contributing conformations, the models selected by SAXS-A-FOLD could constitute a set of starting structures for more advanced MD simulations.
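
The enhancement of the selected set with models generated before and after each selected model can be sketched as an index-window expansion over the Monte Carlo pool. The function name `expand_with_neighbors` and the zero-based indexing are assumptions of this sketch; the abstract does not specify how the website stores or numbers the pool.

```python
def expand_with_neighbors(selected, pool_size, n_neighbors):
    """Return sorted pool indices covering each selected model plus up to
    n_neighbors models generated immediately before and after it.

    selected:    iterable of zero-based indices into the Monte Carlo pool.
    pool_size:   total number of models in the pool.
    n_neighbors: user-defined number of models to add on each side.
    """
    expanded = set()
    for idx in selected:
        lo = max(0, idx - n_neighbors)               # clip at start of pool
        hi = min(pool_size - 1, idx + n_neighbors)   # clip at end of pool
        expanded.update(range(lo, hi + 1))
    return sorted(expanded)


# Overlapping windows are merged and windows are clipped at the pool edges.
print(expand_with_neighbors([0, 10, 11], pool_size=20, n_neighbors=2))
# prints [0, 1, 2, 8, 9, 10, 11, 12, 13]
```

The expanded index list would then define the set of models for which I(q) is computed and passed to the final NNLS fit.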

Source journal
CiteScore: 10.00
Self-citation rate: 3.30%
Articles published: 178
Review time: 4.7 months
About the journal: Many research topics in condensed matter research, materials science and the life sciences make use of crystallographic methods to study crystalline and non-crystalline matter with neutrons, X-rays and electrons. Articles published in the Journal of Applied Crystallography focus on these methods and their use in identifying structural and diffusion-controlled phase transformations, structure-property relationships, structural changes of defects, interfaces and surfaces, etc. Developments of instrumentation and crystallographic apparatus, theory and interpretation, numerical analysis and other related subjects are also covered. The journal is the primary place where crystallographic computer program information is published.