Wentai Zhang, Li Shen, Thomas Page, Guojie Luo, Peng Li, P. Maass, M. Jiang, J. Cong
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015-02-22. DOI: https://doi.org/10.1145/2684746.2689097
FPGA Acceleration for Simultaneous Image Reconstruction and Segmentation based on the Mumford-Shah Regularization (Abstract Only)
X-ray computed tomography is an important technique for clinical diagnosis and nondestructive testing. In many applications, a number of image processing steps are needed before the image information becomes useful. Image segmentation is one such step and has important applications. The conventional flow is to reconstruct the image first and segment it afterwards. In contrast, an iterative method for simultaneous reconstruction and segmentation (SRS) with the Mumford-Shah model has been proposed, which not only regularizes the ill-posed tomographic reconstruction problem but also produces the image segmentation at the same time. The Mumford-Shah model is both mathematically and computationally difficult. In this paper, we propose a data-decomposed algorithm for the SRS method and accelerate it using FPGA devices. The proposed algorithm has a structure that invokes a single kernel many times without involving other computational tasks. Though this structure seems best suited to GPU-like devices, experimental results show that the FPGA acceleration achieves speedups of 73X, 11X, and 1.4X over the CPU implementation of the original SRS algorithm, the CPU implementation of the ray-parallel SRS algorithm, and the GPU implementation of the ray-parallel SRS algorithm, respectively.
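As a rough illustration of what "invoking a single kernel many times" can look like, the sketch below assumes a least-squares data term 0.5 * ||A f - g||^2, whose gradient splits into one independent contribution per ray. The names `ray_kernel` and `fidelity_gradient`, the toy system, and the choice of this data term are assumptions made for illustration only; the abstract does not specify the actual kernel, and the Mumford-Shah regularization terms are omitted here.

```python
def ray_kernel(a_i, g_i, f):
    """Contribution of a single ray to the gradient: a_i * (a_i . f - g_i).

    a_i: weights of this ray over the image pixels (hypothetical system row)
    g_i: measured projection value for this ray
    f:   current image estimate, flattened to a list
    """
    residual = sum(a * x for a, x in zip(a_i, f)) - g_i
    return [a * residual for a in a_i]


def fidelity_gradient(A, g, f):
    """Accumulate the gradient by calling the same kernel once per ray.

    Each call depends only on its own ray, so the calls could in principle
    be distributed across parallel hardware units.
    """
    grad = [0.0] * len(f)
    for a_i, g_i in zip(A, g):
        contrib = ray_kernel(a_i, g_i, f)
        grad = [gj + cj for gj, cj in zip(grad, contrib)]
    return grad


# Toy system (illustrative values): 3 rays through a 2-pixel image.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
g = [2.0, 3.0, 5.0]  # projections of the image [2.0, 3.0]

grad_at_zero = fidelity_gradient(A, g, [0.0, 0.0])
grad_at_solution = fidelity_gradient(A, g, [2.0, 3.0])
```

In this toy case the gradient vanishes at the image that reproduces the measurements exactly, and the per-ray sum equals the dense expression A^T (A f - g).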