TORO Indexer: a PyTorch-based indexing algorithm for kilohertz serial crystallography.

Impact factor 6.1 · Zone 3 (Materials Science) · JCR Q1 (Biochemistry, Genetics and Molecular Biology)
Journal of Applied Crystallography Pub Date : 2024-06-18 eCollection Date: 2024-08-01 DOI:10.1107/S1600576724003182
Piero Gasparotto, Luis Barba, Hans-Christian Stadler, Greta Assmann, Henrique Mendonça, Alun W Ashton, Markus Janousch, Filip Leonarski, Benjamín Béjar
Journal of Applied Crystallography, volume 57 (Pt 4), pages 931-944. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11299607/pdf/

Abstract

Serial crystallography (SX) involves combining observations from a very large number of diffraction patterns coming from crystals in random orientations. To compile a complete data set, these patterns must be indexed (i.e. their orientation determined), integrated and merged. Introduced here is TORO (Torch-powered robust optimization) Indexer, a robust and adaptable indexing algorithm developed using the PyTorch framework. TORO runs on graphics processing units (GPUs), central processing units (CPUs) and other hardware accelerators supported by PyTorch, ensuring compatibility with a wide variety of computational setups. In tests, TORO outpaces existing solutions, indexing thousands of frames per second when running on GPUs, which makes it an attractive candidate for real-time indexing and user feedback. The algorithm streamlines some of the ideas introduced by previous indexers such as the DIALS real-space grid search [Gildea, Waterman, Parkhurst, Axford, Sutton, Stuart, Sauter, Evans & Winter (2014). Acta Cryst. D70, 2652-2666] and XGandalf [Gevorkov, Yefanov, Barty, White, Mariani, Brehm, Tolstikova, Grigat & Chapman (2019). Acta Cryst. A75, 694-704] and refines them using faster and principled robust optimization techniques, resulting in a concise code base of fewer than 500 lines. In evaluations across four proteins, TORO consistently matches, and in certain instances outperforms, established algorithms such as XGandalf and MOSFLM [Powell (1999). Acta Cryst. D55, 1690-1695], occasionally improving the quality of the merged data while achieving superior indexing speed. The inherent modularity of TORO and the versatility of PyTorch code bases facilitate its deployment into a wide array of architectures, software platforms and bespoke applications, highlighting its prospective significance in SX.
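
To make the real-space grid search idea mentioned in the abstract concrete, here is a minimal sketch. It is not TORO's code and does not reproduce its robust optimization step. It assumes only the textbook property that a real-space lattice vector v satisfies v · q ≈ integer for every reciprocal-space spot q, and scores candidate directions by counting the spots that satisfy it. All names (`score_candidate`, `hemisphere_directions`, `grid_search`), the tolerance, the number of directions and the synthetic 10 × 15 × 20 Å cell are hypothetical choices made for illustration. The sketch uses plain Python so it runs without dependencies; in a tensor framework the two nested loops would likely collapse into a single batched matrix product.

```python
import math

def score_candidate(vec, spots, tol=0.05):
    """Count spots q whose projection vec . q lies within tol of an integer."""
    hits = 0
    for qx, qy, qz in spots:
        proj = vec[0] * qx + vec[1] * qy + vec[2] * qz
        if abs(proj - round(proj)) < tol:
            hits += 1
    return hits

def hemisphere_directions(n):
    """Roughly uniform unit vectors on the upper hemisphere (Fibonacci spiral).

    Only a hemisphere is needed because v and -v score identically.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for i in range(n):
        z = (i + 0.5) / n              # z runs over (0, 1)
        r = math.sqrt(1.0 - z * z)     # radius of the circle at height z
        phi = i * golden_angle
        dirs.append((r * math.cos(phi), r * math.sin(phi), z))
    return dirs

def grid_search(spots, length, n_dirs=2000, tol=0.05):
    """Return the sampled direction (and its score) that, scaled to `length`,
    explains the most spots. Ties keep the first direction found."""
    best_dir, best_score = None, -1
    for d in hemisphere_directions(n_dirs):
        vec = (length * d[0], length * d[1], length * d[2])
        score = score_candidate(vec, spots, tol)
        if score > best_score:
            best_dir, best_score = d, score
    return best_dir, best_score

# Synthetic, noise-free data (an assumption for this sketch): an orthorhombic
# cell aligned with the coordinate axes, so the reciprocal-space spots are
# (h/a, k/b, l/c) for small Miller indices h, k, l.
cell = (10.0, 15.0, 20.0)
spots = [
    (h / cell[0], k / cell[1], l / cell[2])
    for h in range(-3, 4)
    for k in range(-3, 4)
    for l in range(-3, 4)
]

# Search for the direction of the a axis, given its expected length.
best_dir, best_score = grid_search(spots, length=cell[0])
print(best_dir, best_score, len(spots))
```

With real data the spots are noisy and some belong to other lattices or are spurious, so the best grid direction is only a starting point; an indexer would then refine the candidate vectors and combine three of them into a cell, steps this sketch leaves out.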

Source journal: Journal of Applied Crystallography
CiteScore: 10.00 · Self-citation rate: 3.30% · Articles published: 178 · Review time: 4.7 months
Journal description: Many research topics in condensed matter research, materials science and the life sciences make use of crystallographic methods to study crystalline and non-crystalline matter with neutrons, X-rays and electrons. Articles published in the Journal of Applied Crystallography focus on these methods and their use in identifying structural and diffusion-controlled phase transformations, structure-property relationships, structural changes of defects, interfaces and surfaces, etc. Developments of instrumentation and crystallographic apparatus, theory and interpretation, numerical analysis and other related subjects are also covered. The journal is the primary place where crystallographic computer program information is published.