cuMBIR: An Efficient Framework for Low-dose X-ray CT Image Reconstruction on GPUs

Xiuhong Li, Yun Liang, Wentai Zhang, Taide Liu, Haochen Li, Guojie Luo, M. Jiang
Published in: Proceedings of the 2018 International Conference on Supercomputing
Publication date: 2018-06-12
DOI: 10.1145/3205289.3205309 (https://doi.org/10.1145/3205289.3205309)
Citations: 12

Abstract

Low-dose X-ray computed tomography (XCT) is a popular imaging technique for visualizing the internal structure of an object non-destructively. Model-based Iterative Reconstruction (MBIR) can reconstruct high-quality images, but at the cost of large computational demands. MBIR therefore often resorts to platforms with hardware accelerators such as GPUs to speed up reconstruction. In MBIR, reconstruction minimizes an objective function by updating the image iteratively. The X-ray source emits a large number of X-rays from various views to cover the object as fully as possible, and these X-rays have complex and irregular geometric relationships. This inherent irregularity makes minimizing the objective function on GPUs very challenging. First, different implementations of the minimization have different impacts on convergence and GPU resource utilization. To this end, we explore different solvers for the minimization problem and different parallelism granularities for GPU kernel design. Second, the complex and irregular geometric relationships among X-rays introduce irregular memory behavior: two nearby X-rays may intersect and thus incur memory collisions, while two far-apart X-rays may incur non-coalesced memory accesses. We design a unified thread mapping algorithm to guide the mapping from X-rays to threads, which addresses memory collisions and non-coalesced memory accesses together. Finally, we present a series of architecture-level optimizations to fully exploit the computing power of GPUs. Evaluation results demonstrate that cuMBIR can achieve a 1.48X speedup over the state-of-the-art implementation on GPUs.
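The abstract describes reconstruction as iteratively updating an image to minimize an objective function, without naming the objective or the solver. As a rough illustration only, the sketch below assumes a least-squares data term plus a small quadratic smoothness penalty, minimized by fixed-step gradient descent on a toy problem. The system matrix `A`, the penalty weight `beta`, the step size, and all function names are hypothetical choices made for this example. They are not taken from cuMBIR, and the code runs sequentially on the CPU.

```python
# Toy sketch of iterative reconstruction. All names and values here are
# illustrative assumptions, not the cuMBIR implementation.

def forward_project(A, x):
    """Return A @ x: the value each ray would measure for image x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def objective(A, y, x, beta):
    """Data misfit 0.5*||Ax - y||^2 plus a quadratic smoothness penalty."""
    residual = [p - m for p, m in zip(forward_project(A, x), y)]
    data_term = 0.5 * sum(r * r for r in residual)
    penalty = 0.5 * beta * sum((x[j + 1] - x[j]) ** 2 for j in range(len(x) - 1))
    return data_term + penalty


def gradient(A, y, x, beta):
    """Gradient of the objective with respect to every voxel."""
    residual = [p - m for p, m in zip(forward_project(A, x), y)]
    n = len(x)
    grad = [0.0] * n
    # Back-projection A^T r: each ray adds its residual to the voxels it crosses.
    # Rays that share a voxel all write to the same grad[j]; with one GPU thread
    # per ray, these shared updates would collide unless made safe.
    for row, r in zip(A, residual):
        for j, a in enumerate(row):
            grad[j] += a * r
    # Penalty gradient couples each voxel with its index neighbours
    # (an assumed, simplified prior).
    for j in range(n):
        if j > 0:
            grad[j] += beta * (x[j] - x[j - 1])
        if j < n - 1:
            grad[j] += beta * (x[j] - x[j + 1])
    return grad


def reconstruct(A, y, beta=0.01, step=0.1, iterations=500):
    """Run fixed-step gradient descent starting from an all-zero image."""
    x = [0.0] * len(A[0])
    for _ in range(iterations):
        g = gradient(A, y, x, beta)
        x = [v - step * gv for v, gv in zip(x, g)]
    return x


# Hypothetical 4-voxel image crossed by 5 rays; A[i][j] is 1 when ray i
# crosses voxel j and 0 otherwise.
A = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
]
x_true = [1.0, 2.0, 3.0, 4.0]
y = forward_project(A, x_true)  # noise-free synthetic measurements

x_hat = reconstruct(A, y)
print([round(v, 2) for v in x_hat])
```

In this toy, rays 0, 2, and 4 all cross voxel 0, so the back-projection loop accumulates three contributions into the same entry. That is the kind of shared update the abstract refers to as a memory collision once rays are mapped to parallel threads.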