Concurrent systolic architecture for high-throughput implementation of 3-dimensional discrete wavelet transform
B. K. Mohanty, P. Meher
2008 International Conference on Application-Specific Systems, Architectures and Processors
Published: 2008-07-02
DOI: 10.1109/ASAP.2008.4580172 (https://doi.org/10.1109/ASAP.2008.4580172)
Citations: 3
Abstract
In this paper, we present a novel systolic architecture for high-throughput computation of the 3-dimensional (3-D) discrete wavelet transform (DWT). The entire 3-D DWT computation is decomposed into three distinct stages, which are implemented concurrently in a linear array of fully pipelined processing elements (PEs). Compared with the existing architecture, the proposed structure provides higher throughput, requires nearly half, or fewer, of the multipliers and adders, and uses less on-chip memory when normalized for unit throughput rate. Most importantly, unlike the existing architecture, it does not require a frame buffer to perform inter-frame DWT computation. The proposed structure has a small latency and can perform 3-D DWT computation with 100% hardware utilization efficiency.
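
To make the three-stage decomposition concrete, the following is a minimal software sketch of a one-level separable 3-D DWT, not the systolic architecture itself. Several things here are assumptions not stated in the abstract: the Haar filter pair, the axis ordering (rows, then columns, then frames), and the hypothetical names `haar_1d`, `transform_axis`, and `dwt3d`. The sketch also runs the stages one after another, whereas the abstract describes them running concurrently in pipelined PEs.

```python
from math import sqrt

# Haar filters are an assumption for illustration; the abstract names no filter bank.
S = 1 / sqrt(2)


def haar_1d(x):
    """One-level Haar DWT of an even-length sequence; returns (low, high)."""
    half = len(x) // 2
    low = [(x[2 * i] + x[2 * i + 1]) * S for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) * S for i in range(half)]
    return low, high


def transform_axis(vol, axis):
    """Apply haar_1d along one axis of vol[frame][row][col].

    The low-pass half is stored first and the high-pass half second
    along the transformed axis.
    """
    dims = (len(vol), len(vol[0]), len(vol[0][0]))
    out = [[[0.0] * dims[2] for _ in range(dims[1])] for _ in range(dims[0])]
    a, b = [d for d in range(3) if d != axis]
    for i in range(dims[a]):
        for j in range(dims[b]):
            idx = [0, 0, 0]
            idx[a], idx[b] = i, j
            # Gather the 1-D line that runs along the chosen axis.
            line = []
            for k in range(dims[axis]):
                idx[axis] = k
                line.append(vol[idx[0]][idx[1]][idx[2]])
            low, high = haar_1d(line)
            # Scatter the subbands back into the same line of the output.
            for k, v in enumerate(low + high):
                idx[axis] = k
                out[idx[0]][idx[1]][idx[2]] = v
    return out


def dwt3d(vol):
    """One-level 3-D DWT as three 1-D stages (axis order is an assumption)."""
    stage1 = transform_axis(vol, 2)     # along each row, within a frame
    stage2 = transform_axis(stage1, 1)  # along each column, within a frame
    stage3 = transform_axis(stage2, 0)  # across frames (inter-frame stage)
    return stage3
```

In this sequential form the third stage needs every frame of `stage2` to be available before it starts, which is the kind of frame-level storage the abstract says the proposed structure avoids.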