Cesar Ramirez,Elena Di Mare,James Byrnes,Eman Ahmed,Maria Pineiro-Goncalves,Cristian Lopez,N Sanjeeva Murthy,Adam J Gormley
Biophysical Journal. Published 2025-09-24. DOI: 10.1016/j.bpj.2025.09.034 (https://doi.org/10.1016/j.bpj.2025.09.034)
SAXS Assistant: Automated SAXS Analysis for Structural Discovery in Biologics and Polymeric Nanoparticles.
Small-angle X-ray scattering (SAXS) is a powerful technique for assessing macromolecular structure. High-throughput SAXS is limited by the time-consuming and, at times, subjective nature of SAXS data interpretation. We present SAXS Assistant, a Python-based script that streamlines SAXS data analysis to extract features for machine learning (ML) and key structural parameters, including the Guinier radius of gyration (Rg), the pair distance distribution function (PDDF)-derived Rg, the maximum particle dimension (Dmax), and Kratky plots. The script builds upon BioXTAS RAW and validates reliability via Guinier/PDDF Rg agreement, an important indicator of well-measured datasets. To assist in Dmax estimation, a multi-layer perceptron (MLP) regressor was trained on 1,940 data files from the Small Angle Scattering Biological Data Bank (SASBDB). The model achieved a test-set R² = 0.90 and a mean absolute error (MAE) of 11.7 Å. Training exclusively on experimental data transfers the analyses of researchers, including experts in the field, to the ML model, which helps assess Dmax estimates from the PDDF. Gaussian mixture model (GMM) clustering was implemented to classify profiles into structural classes based on entries in the SASBDB. Users may therefore assess the similarity between experimental samples and known biomolecular shapes within the mapped repository entries. This probabilistic clustering aids in quantifying information from Kratky plots and generating shape-descriptive features. SAXS Assistant accelerates SAXS data analysis through enforced quality control, ML-ready outputs, and flags for low-confidence results. In addition to enabling high-throughput analysis of large datasets, this tool is versatile and may serve researchers in both biological and synthetic polymer research fields.
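The Guinier/PDDF Rg agreement check mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the SAXS Assistant implementation and does not use the BioXTAS RAW API. The function names (`guinier_rg`, `rg_from_pddf`, `rg_agreement`), the q·Rg ≤ 1.3 fitting limit, and the 15% tolerance are all assumptions chosen for illustration. The sketch relies on two standard relations: the Guinier approximation ln I(q) ≈ ln I(0) − (q·Rg)²/3 at low q, and Rg² = ∫r²P(r)dr / (2∫P(r)dr) for the PDDF.

```python
import numpy as np


def guinier_rg(q, intensity, qrg_max=1.3, max_iter=50):
    """Estimate Rg and I(0) from a straight-line fit of ln I(q) against q^2.

    Assumes q is sorted ascending and all intensities are positive.
    The q*Rg <= qrg_max limit is a common rule of thumb, used here as an assumption.
    """
    q = np.asarray(q, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    n = len(q)  # start from the full range, then shrink the fitting window
    for _ in range(max_iter):
        slope, intercept = np.polyfit(q[:n] ** 2, np.log(intensity[:n]), 1)
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; Rg is undefined")
        rg = np.sqrt(-3.0 * slope)
        # Keep only the points that satisfy q * Rg <= qrg_max.
        n_new = int(np.searchsorted(q, qrg_max / rg, side="right"))
        if n_new < 3:
            raise ValueError("too few points in the Guinier region")
        if n_new == n:
            break  # the fitting window is self-consistent
        n = n_new
    return rg, float(np.exp(intercept))


def _trapezoid(y, x):
    """Trapezoidal integration, written out to avoid NumPy version differences."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)


def rg_from_pddf(r, p):
    """Real-space Rg from a PDDF: Rg^2 = int r^2 P(r) dr / (2 int P(r) dr)."""
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.sqrt(_trapezoid(r ** 2 * p, r) / (2.0 * _trapezoid(p, r))))


def rg_agreement(rg_guinier, rg_pddf, tolerance=0.15):
    """Return True when the two Rg estimates differ by at most `tolerance` (relative).

    The 15% default is a hypothetical threshold, not a value from the paper.
    """
    return bool(abs(rg_guinier - rg_pddf) / rg_guinier <= tolerance)
```

For a profile loaded as arrays `q` and `intensity`, `guinier_rg(q, intensity)` returns the reciprocal-space estimate, and a PDDF obtained from any indirect Fourier transform tool can be passed to `rg_from_pddf`. A dataset whose two estimates fail `rg_agreement` would be a candidate for a low-confidence flag.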
About the journal:
BJ publishes original articles, letters, and perspectives on important problems in modern biophysics. The papers should be written so as to be of interest to a broad community of biophysicists. BJ welcomes experimental studies that employ quantitative physical approaches for the study of biological systems, including or spanning scales from molecule to whole organism. Experimental studies of a purely descriptive or phenomenological nature, with no theoretical or mechanistic underpinning, are not appropriate for publication in BJ. Theoretical studies should offer new insights into the understanding of experimental results or suggest new experimentally testable hypotheses. Articles reporting significant methodological or technological advances, which have the potential to open new areas of biophysical investigation, are also suitable for publication in BJ. Papers describing improvements in accuracy or speed of existing methods or extra detail within methods described previously are not suitable for BJ.