Authors: S. Ha, Zhiyuan Zhang, K. Mueller, S. Matej
Published in: IEEE Nuclear Science Symposium & Medical Imaging Conference, pp. 2866-2867, 2010-10-01
DOI: 10.1109/NSSMIC.2010.5874319 (https://doi.org/10.1109/NSSMIC.2010.5874319)
Efficiently GPU-accelerating long kernel convolutions in 3-D DIRECT TOF PET reconstruction via a kernel decomposition scheme
The DIRECT approach for 3-D Time-of-Flight (TOF) PET reconstruction performs all iterative predictor-corrector operations directly in image space. A computational bottleneck of this approach is the convolution with the long TOF (resolution) kernels. Accelerating this convolution on GPUs is especially important for spatially variant resolution kernels, which cannot be implemented efficiently in the Fourier domain. The main challenge is memory cache performance in non-axis-aligned directions. We devised a scheme that first resamples the image into an axis-aligned orientation, which offers good memory coherence for the convolution operations. To maintain good accuracy, we carefully design the resampling kernel and the new convolution kernel so that they combine into the original TOF kernel. This paper demonstrates the validity, accuracy, and high speed of our scheme for a comprehensive set of orientation angles. Future work will apply these cascaded kernels within a GPU-accelerated version of DIRECT.
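The requirement that the resampling and convolution kernels "combine into the original TOF kernel" can be illustrated with a minimal 1-D sketch. It assumes, purely for illustration, that both the TOF kernel and the blur introduced by resampling are Gaussian, in which case the variances of cascaded kernels add. The abstract does not state the kernel shapes or widths, so the function names (`gaussian_kernel`, `residual_sigma`) and the values of `sigma_tof` and `sigma_resample` below are hypothetical and this is not the authors' implementation.

```python
import math

def gaussian_kernel(sigma, radius):
    """Sampled 1-D Gaussian with 2*radius+1 taps, normalized to unit sum."""
    taps = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]

def convolve(a, b):
    """Full discrete convolution of two 1-D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def residual_sigma(sigma_target, sigma_resample):
    """Width of the second kernel so that the cascade matches the target.

    Assumes Gaussian kernels, for which variances add under convolution.
    """
    if sigma_resample >= sigma_target:
        raise ValueError("resampling blur must be narrower than the target kernel")
    return math.sqrt(sigma_target ** 2 - sigma_resample ** 2)

# Hypothetical widths in voxels (not taken from the paper).
sigma_tof = 12.0       # long TOF kernel along the line-of-response direction
sigma_resample = 1.0   # effective blur of the resampling (interpolation) step

sigma_conv = residual_sigma(sigma_tof, sigma_resample)

resample = gaussian_kernel(sigma_resample, 5)   # 11 taps
conv = gaussian_kernel(sigma_conv, 60)          # 121 taps, axis-aligned pass
cascade = convolve(resample, conv)              # 131 taps, radius 65
target = gaussian_kernel(sigma_tof, 65)         # original TOF kernel, same support

# Largest pointwise deviation between the cascaded kernel and the target.
max_err = max(abs(c - t) for c, t in zip(cascade, target))
print("sigma_conv =", sigma_conv)
print("max abs deviation =", max_err)
```

Under this Gaussian assumption, the long convolution kernel is narrowed slightly to compensate for the blur already contributed by resampling, so the two-step cascade reproduces the original kernel rather than over-smoothing. Non-Gaussian or spatially variant kernels would need a different matching procedure.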