Authors: M. Saikia, R. Kanhirodan
DOI: 10.1109/IC3I.2014.7019809 (https://doi.org/10.1109/IC3I.2014.7019809)
Venue: 2014 International Conference on Contemporary Computing and Informatics (IC3I)
Published: 2014-11-01
Citations: 17
High performance single and multi-GPU acceleration for Diffuse Optical Tomography
Diffuse Optical Tomography (DOT) is a diagnostic imaging modality in which optical parameters, such as the absorption and scattering coefficient distributions inside living tissue, are recovered to understand structural and functional variations in the tissue under study. DOT image reconstruction is an iterative numerical process that demands high computational power, especially when recovering a fully three-dimensional (3D) optical property distribution inside a complex geometry such as the human head. This cost prevents physicians from viewing reconstructed images and monitoring a patient in real time. To reconstruct 3D DOT images at high speed, a Broyden-method-based iterative image reconstruction algorithm and a parallelization strategy are implemented on the CUDA parallel computing platform to exploit the computational power of the GPU. Three single-GPU systems, equipped with an Nvidia Tesla C2070, a Tesla K20c, and a Tesla K40 respectively, and a multi-GPU system (two Tesla M2090 GPUs in one compute node of an HPC cluster) are used to evaluate the performance gains from the algorithmic improvement and from GPU parallel computation. We use the three-dimensional finite element method (FEM) and discretize an infant head into 45702 tetrahedral elements and 8703 nodes to solve the forward and inverse problems. We achieve a significant speedup for the 3D DOT image reconstruction of the head phantom.
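The abstract does not spell out the reconstruction algorithm, so the following is only a minimal sketch of the textbook Broyden rank-one update that Broyden-type methods build on, not the authors' CUDA implementation. It refreshes a Jacobian estimate `J` from the latest change in parameters `dx` and in model output `dy`, so that the new estimate satisfies the secant condition `J_new · dx = dy`. The function name `broyden_update` and the plain list-of-lists matrix layout are assumptions made for illustration.

```python
def broyden_update(J, dx, dy):
    """Return the Broyden rank-one update of a Jacobian estimate.

    J  : list of m rows, each a list of n floats (current estimate)
    dx : list of n floats (change in the parameter vector)
    dy : list of m floats (change in the model output)

    Hypothetical helper for illustration; not taken from the paper.
    """
    denom = sum(v * v for v in dx)  # dx . dx
    if denom == 0.0:
        raise ValueError("dx must be non-zero")

    # Residual of the secant condition: r = dy - J @ dx
    r = [
        dy_i - sum(j_ij * dx_j for j_ij, dx_j in zip(row, dx))
        for row, dy_i in zip(J, dy)
    ]

    # J_new = J + outer(r, dx) / (dx . dx)
    return [
        [j_ij + r_i * dx_j / denom for j_ij, dx_j in zip(row, dx)]
        for row, r_i in zip(J, r)
    ]


def matvec(J, x):
    """Multiply a list-of-lists matrix by a vector."""
    return [sum(j_ij * x_j for j_ij, x_j in zip(row, x)) for row in J]
```

The update costs O(mn) operations and each entry of `J_new` is computed independently of the others, which is the kind of structure that generally maps well onto GPU threads. How the paper actually partitions this work across one or two GPUs is not stated in the abstract.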