Thomas White, Tim Schoof, Sergey Yakubov, Aleksandra Tolstikova, Philipp Middendorf, Mikhail Karnevskiy, Valerio Mariani, Alessandra Henkel, Bjarne Klopprogge, Juergen Hannappel, Dominik Oberthuer, Ivan De Gennaro Aquino, Dmitry Egorov, Anna Munke, Janina Sprenger, Guillaume Pompidor, Helena Taberman, Andrey Gruzinov, Jan Meyer, Johanna Hakanpää, Martin Gasthuber
IUCrJ, volume 12, issue 1, pages 97-108. Published 2025-01-01. DOI: 10.1107/S2052252524011837. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11707691/pdf/
Real-time data processing for serial crystallography experiments
We report the use of streaming data interfaces to process data in real time from serial crystallography experiments, with a latency of less than 1 s per frame and without requiring intermediate data storage on disk.
We report the use of streaming data interfaces to perform fully online data processing for serial crystallography experiments, without storing intermediate data on disk. The system produces Bragg reflection intensity measurements suitable for scaling and merging, with a latency of less than 1 s per frame. Our system uses the CrystFEL software in combination with the ASAP::O data framework. In a series of user experiments at PETRA III, frames from a 16 megapixel Dectris EIGER2 X detector were searched for peaks, indexed and integrated at the maximum full-frame readout speed of 133 frames per second. The computational resources required depend on various factors, most significantly the fraction of non-blank frames (‘hits’). The average single-thread processing time per frame was 242 ms for blank frames and 455 ms for hits, meaning that a single 96-core computing node was sufficient to keep up with the data, with ample headroom for unexpected throughput reductions. Further significant improvements are expected, for example by binning pixel intensities together to reduce the pixel count. We discuss the implications of real-time data processing on the ‘data deluge’ problem from recent and future photon-science experiments, in particular on calibration requirements, computing access patterns and the need for the preservation of raw data.
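The claim that one 96-core node can keep up with the detector can be checked with a rough back-of-the-envelope calculation. The sketch below is not taken from the paper's software: it assumes one processing thread per core, perfectly linear scaling across threads, and a steady frame rate, and the function names (`mean_frame_time`, `threads_needed`) are hypothetical. Only the three numbers (133 frames per second, 242 ms per blank frame, 455 ms per hit) and the core count come from the abstract.

```python
import math

# Figures quoted in the abstract.
FRAME_RATE_HZ = 133.0   # maximum full-frame readout speed
T_BLANK_S = 0.242       # mean single-thread time for a blank frame
T_HIT_S = 0.455         # mean single-thread time for a hit
NODE_CORES = 96         # cores in the single computing node


def mean_frame_time(hit_fraction: float) -> float:
    """Average single-thread processing time per frame for a given hit fraction."""
    if not 0.0 <= hit_fraction <= 1.0:
        raise ValueError("hit_fraction must lie between 0 and 1")
    return hit_fraction * T_HIT_S + (1.0 - hit_fraction) * T_BLANK_S


def threads_needed(hit_fraction: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Minimum number of threads to keep pace, assuming ideal linear scaling.

    Each second, frame_rate_hz frames arrive and each costs mean_frame_time
    seconds of single-thread work, so the product is the thread count required.
    """
    return math.ceil(frame_rate_hz * mean_frame_time(hit_fraction))


if __name__ == "__main__":
    for fraction in (0.0, 0.1, 0.5, 1.0):
        n = threads_needed(fraction)
        print(f"hit fraction {fraction:4.0%}: {n:3d} threads "
              f"({NODE_CORES - n} of {NODE_CORES} cores spare)")
```

Under these idealized assumptions, even the worst case in which every frame is a hit needs fewer threads than the node provides, which is consistent with the headroom described in the abstract. Real throughput also depends on factors this estimate ignores, such as memory bandwidth and data transfer overheads.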
About the journal:
IUCrJ is a new fully open-access peer-reviewed journal from the International Union of Crystallography (IUCr).
The journal will publish high-profile articles on all aspects of the sciences and technologies supported by the IUCr via its commissions, including emerging fields where structural results underpin the science reported in the article. Our aim is to make IUCrJ the natural home for high-quality structural science results. Chemists, biologists, physicists and material scientists will be actively encouraged to report their structural studies in IUCrJ.