Parallel discrete wavelet transform using the Open Computing Language: a performance and portability study
Bharatkumar Sharma, N. Vydyanathan
2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010-04-19
DOI: 10.1109/IPDPSW.2010.5470830 (https://doi.org/10.1109/IPDPSW.2010.5470830)
The discrete wavelet transform (DWT) is a powerful signal processing technique used in the JPEG 2000 image compression standard. The multi-resolution sub-band encoding provided by the DWT allows for higher compression ratios, avoids blocking artifacts, and enables progressive transmission of images. However, these advantages come at the expense of additional computational complexity. Achieving real-time or interactive compression and decompression speeds therefore requires a fast implementation of the DWT that leverages emerging parallel hardware systems. In this paper, we develop an optimized parallel implementation of the lifting-based DWT algorithm using the recently proposed Open Computing Language (OpenCL). OpenCL is a standard for cross-platform parallel programming of heterogeneous systems comprising multi-core CPUs, GPUs, and other accelerators. We explore the potential of OpenCL for accelerating the DWT computation and analyze the programmability, portability, and performance aspects of this language. Our experimental analysis uses NVIDIA's and AMD's drivers that support OpenCL.
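To make the lifting idea concrete, the following is a minimal sequential sketch of one level of a 1D lifting-based DWT. It assumes the reversible 5/3 filter from JPEG 2000 with symmetric boundary extension; the abstract does not say which filter the authors implemented, and this is not their OpenCL kernel. The function names `lifting_53_forward` and `lifting_53_inverse` are hypothetical, and the sketch handles only even-length integer signals.

```python
def lifting_53_forward(x):
    """One level of the reversible 5/3 lifting DWT on an even-length list of ints.

    Returns (s, d): the low-pass (approximation) and high-pass (detail) halves.
    Assumption: symmetric extension at both signal boundaries.
    """
    n = len(x)
    if n < 2 or n % 2 != 0:
        raise ValueError("this sketch handles even-length signals of length >= 2")
    even, odd = x[0::2], x[1::2]
    half = n // 2

    # Predict step: each detail coefficient depends only on input samples,
    # so all iterations are independent (one per work-item in a parallel version).
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # mirror at right edge
        d.append(odd[i] - (even[i] + right) // 2)

    # Update step: each approximation coefficient depends only on the finished
    # detail coefficients, so these iterations are also independent of each other.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]  # mirror at left edge
        s.append(even[i] + (left + d[i] + 2) // 4)
    return s, d


def lifting_53_inverse(s, d):
    """Undo lifting_53_forward by running the lifting steps in reverse order."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - (left + d[i] + 2) // 4)
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + (even[i] + right) // 2)
    return x


if __name__ == "__main__":
    signal = [3, 7, 1, 8, 2, 9, 4, 6]
    low, high = lifting_53_forward(signal)
    assert lifting_53_inverse(low, high) == signal  # integer lifting is lossless
```

Within each of the two steps the loop iterations do not depend on one another, which is the property a data-parallel implementation would exploit. A 2D transform would typically apply such a 1D pass along rows and then along columns.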