SAXS Assistant: Automated SAXS Analysis for Structural Discovery in Biologics and Polymeric Nanoparticles.

IF 3.1 · Zone 3 (Biology) · JCR Q2 (Biophysics)
Cesar Ramirez, Elena Di Mare, James Byrnes, Eman Ahmed, Maria Pineiro-Goncalves, Cristian Lopez, N Sanjeeva Murthy, Adam J Gormley
Journal: Biophysical Journal
Publication date: 2025-09-24
Publication type: Journal Article
DOI: 10.1016/j.bpj.2025.09.034 (https://doi.org/10.1016/j.bpj.2025.09.034)
Citations: 0

Abstract

Small-angle X-ray scattering (SAXS) is a powerful technique for assessing macromolecular structure. High-throughput SAXS is limited by the time-consuming and, at times, subjective nature of SAXS data interpretation. We present SAXS Assistant, a Python-based script that streamlines SAXS data analysis to extract features for machine learning (ML) and key structural parameters, including the Guinier radius of gyration (Rg), the pair distance distribution function (PDDF)-derived Rg, the maximum particle dimension (Dmax), and Kratky plots. The script builds upon BioXTAS RAW and validates reliability via Guinier/PDDF Rg agreement, an important indicator of well-measured datasets. To assist in Dmax estimation, a multi-layer perceptron (MLP) regressor was trained with 1,940 data files from the Small Angle Scattering Biological Data Bank (SASBDB). The model achieved a test-set performance of R² = 0.90 and a mean absolute error (MAE) of 11.7 Å. Training exclusively with experimental data translates analyses from researchers, including experts in the field, to the ML model, which helps assess Dmax estimations from the PDDF. Gaussian mixture model (GMM) clustering was implemented to classify profiles into structural classes based on entries in the SASBDB. Users may therefore assess the similarity between experimental samples and known biomolecular shapes within the mapped repository entries. This probabilistic clustering aids in quantifying information from Kratky plots and generating shape-descriptive features. SAXS Assistant accelerates SAXS data analysis through enforced quality control, ML-ready outputs, and flags for low-confidence results. In addition to providing the ability to analyze large datasets at high throughput, this tool is versatile and may serve researchers in both biological and synthetic polymer research fields.
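
The Guinier/PDDF Rg agreement check mentioned in the abstract can be illustrated with a minimal sketch. This is not the SAXS Assistant code and does not use the BioXTAS RAW API. It is a standard-library illustration of the two textbook relations involved: the Guinier approximation ln I(q) ≈ ln I(0) − (Rg²/3)·q², and Rg² = ∫ r² p(r) dr / (2 ∫ p(r) dr) for the PDDF. The function names, the q·Rg ≤ 1.3 limit, the starting window of 10 points, and the 10% agreement tolerance are all assumptions chosen for illustration; the abstract does not state the thresholds the tool actually uses.

```python
import math
from bisect import bisect_right


def _linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def guinier_fit(q, intensity, qrg_max=1.3, n_start=10, max_iter=50):
    """Fit ln I vs q^2 on the low-q points, resizing the window until it
    matches q * Rg <= qrg_max. Returns (Rg, I0, number of points used).
    qrg_max, n_start and max_iter are hypothetical defaults."""
    pts = sorted((qi, ii) for qi, ii in zip(q, intensity) if ii > 0)
    qs = [p[0] for p in pts]
    x = [qi * qi for qi in qs]
    y = [math.log(p[1]) for p in pts]
    n = min(n_start, len(qs))
    if n < 3:
        raise ValueError("need at least three positive-intensity points")
    for _ in range(max_iter):
        slope, intercept = _linear_fit(x[:n], y[:n])
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; no Rg estimate")
        rg = math.sqrt(-3.0 * slope)
        # Number of points that satisfy q <= qrg_max / Rg for this Rg.
        n_new = min(len(qs), max(n_start, bisect_right(qs, qrg_max / rg)))
        if n_new == n:
            break
        n = n_new
    return rg, math.exp(intercept), n


def rg_from_pddf(r, p):
    """Rg from p(r) by trapezoidal integration of the two moments."""
    def trapz(f):
        return sum(0.5 * (f[i] + f[i + 1]) * (r[i + 1] - r[i])
                   for i in range(len(r) - 1))
    second_moment = trapz([ri * ri * pi for ri, pi in zip(r, p)])
    return math.sqrt(second_moment / (2.0 * trapz(list(p))))


def rg_agreement(rg_guinier, rg_pddf, tol=0.10):
    """True if the two Rg values differ by at most tol (relative to their
    mean). The 10% tolerance is an assumed, illustrative threshold."""
    mean = 0.5 * (rg_guinier + rg_pddf)
    return abs(rg_guinier - rg_pddf) / mean <= tol
```

In use, one would call `guinier_fit(q, intensity)` on a measured profile, `rg_from_pddf(r, p)` on the p(r) obtained from an indirect Fourier transform, and flag the dataset as low confidence when `rg_agreement` returns `False`. Real data would also call for error weighting and a fit-quality check, which this sketch omits.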
Source journal
Biophysical Journal (Biology: Biophysics)
CiteScore: 6.10
Self-citation rate: 5.90%
Articles published: 3090
Review time: 2 months
Journal description: BJ publishes original articles, letters, and perspectives on important problems in modern biophysics. The papers should be written so as to be of interest to a broad community of biophysicists. BJ welcomes experimental studies that employ quantitative physical approaches for the study of biological systems, including or spanning scales from molecule to whole organism. Experimental studies of a purely descriptive or phenomenological nature, with no theoretical or mechanistic underpinning, are not appropriate for publication in BJ. Theoretical studies should offer new insights into the understanding of experimental results or suggest new experimentally testable hypotheses. Articles reporting significant methodological or technological advances, which have potential to open new areas of biophysical investigation, are also suitable for publication in BJ. Papers describing improvements in accuracy or speed of existing methods or extra detail within methods described previously are not suitable for BJ.