David W Mittan-Moreau, Vanessa Oklejas, Daniel W Paley, Asmit Bhowmick, Romie C Nguyen, Aimin Liu, Jan Kern, Nicholas K Sauter, Aaron S Brewster
Title: Robust error calibration for serial crystallography
Journal: Acta Crystallographica Section D, Structural Biology, volume 81, part 5, pages 265-275
Published: 2025-05-01 (Epub 2025-04-29)
DOI: https://doi.org/10.1107/S2059798325002852
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12054365/pdf/
Citation count: 0
Abstract
Serial crystallography is an important technique with unique abilities to resolve enzymatic transition states, minimize radiation damage to sensitive metalloenzymes and perform de novo structure determination from micrometre-sized crystals. Because this technique requires merging data from thousands of crystals, manual identification of errant crystals is infeasible. cctbx.xfel.merge uses filtering to remove problematic data. However, this filtering is imperfect, so data reduction must be robust to outliers. We add robustness to cctbx.xfel.merge at the step of uncertainty determination for reflection intensities. This step is a critical point for robustness because it is the first step where the data sets are considered as a whole, as opposed to individual lattices. Robustness is conferred by reformulating the error-calibration procedure to have fewer and less stringent statistical assumptions and by incorporating the ability to down-weight low-quality lattices. We then apply this method to five macromolecular XFEL data sets and observe improvements in each. The appropriateness of the intensity uncertainties is demonstrated through internal consistency, which is assessed using theoretical CC₁/₂ and I/σ relationships and using weighted second moments, which use Wilson's prior to connect intensity uncertainties with their expected distribution. This work presents new mathematical tools to analyze intensity statistics and demonstrates their effectiveness through the often underappreciated process of uncertainty analysis.
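The idea of using Wilson's prior to check intensity uncertainties can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the weighted second-moment formulation of the paper and not part of cctbx.xfel.merge. It assumes acentric reflections whose true intensities follow an exponential (Wilson) distribution, for which ⟨I²⟩/⟨I⟩² = 2, and independent zero-mean Gaussian measurement errors, so that ⟨I_obs²⟩ = ⟨I²⟩ + ⟨σ²⟩. The function names `corrected_second_moment` and `simulate_observations` and all numerical values are hypothetical.

```python
import random


def corrected_second_moment(intensities, sigmas):
    """Return (<I_obs^2> - <sigma^2>) / <I_obs>^2.

    Under the assumptions stated above (acentric Wilson statistics,
    independent zero-mean errors), this ratio is expected to be close
    to 2 when the sigmas are well calibrated.
    """
    n = len(intensities)
    mean_i = sum(intensities) / n
    mean_i2 = sum(i * i for i in intensities) / n
    mean_var = sum(s * s for s in sigmas) / n
    return (mean_i2 - mean_var) / (mean_i * mean_i)


def simulate_observations(n=200_000, mean_intensity=100.0, true_sigma=30.0, seed=0):
    """Simulate noisy observations of Wilson-distributed intensities."""
    rng = random.Random(seed)
    # True acentric intensities: exponential with the given mean.
    true_i = [rng.expovariate(1.0 / mean_intensity) for _ in range(n)]
    # Add Gaussian measurement error with a known standard deviation.
    return [t + rng.gauss(0.0, true_sigma) for t in true_i]


observed = simulate_observations()

# Sigmas equal to the true error: the ratio should be close to 2.
calibrated = corrected_second_moment(observed, [30.0] * len(observed))

# Sigmas that are too small: too little variance is subtracted,
# so the ratio is inflated above the calibrated value.
underestimated = corrected_second_moment(observed, [10.0] * len(observed))

print(f"calibrated sigmas:     {calibrated:.3f}")
print(f"underestimated sigmas: {underestimated:.3f}")
```

In this toy setting, a ratio that departs from the Wilson expectation signals that the reported uncertainties do not account for the observed spread of intensities, which is the kind of internal-consistency check the abstract describes.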
About the journal:
Acta Crystallographica Section D welcomes the submission of articles covering any aspect of structural biology, with a particular emphasis on the structures of biological macromolecules or the methods used to determine them.
Reports on new structures of biological importance may address the smallest macromolecules to the largest complex molecular machines. These structures may have been determined using any structural biology technique including crystallography, NMR, cryoEM and/or other techniques. The key criterion is that such articles must present significant new insights into biological, chemical or medical sciences. The inclusion of complementary data that support the conclusions drawn from the structural studies (such as binding studies, mass spectrometry, enzyme assays, or analysis of mutants or other modified forms of biological macromolecule) is encouraged.
Methods articles may include new approaches to any aspect of biological structure determination or structure analysis but will only be accepted where they focus on new methods that are demonstrated to be of general applicability and importance to structural biology. Articles describing particularly difficult problems in structural biology are also welcomed, if the analysis would provide useful insights to others facing similar problems.