{"title":"三维离散小波变换高吞吐量实现的并发收缩结构","authors":"B. K. Mohanty, P. Meher","doi":"10.1109/ASAP.2008.4580172","DOIUrl":null,"url":null,"abstract":"In this paper, we present a novel systolic architecture for high-throughput computation of 3-dimensional (3-D) discrete wavelet transform (DWT). The entire 3-D DWT computation is decomposed into three distinct stages and implemented concurrently in a linear array of fully pipelined processing elements (PE). The proposed structure for 3-D DWT provides higher throughput than the existing architecture; and involves nearly half or less the number of multipliers and adders; and less on-chip memory (when normalized for unit throughput rate) than the other. Most importantly, the proposed one does not require any frame buffer unlike the other to perform inter-frame DWT computation. The proposed structure has a small latency and can perform 3-D DWT computation with 100% hardware unitization efficiency.","PeriodicalId":246715,"journal":{"name":"2008 International Conference on Application-Specific Systems, Architectures and Processors","volume":"123 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Concurrent systolic architecture for high-throughput implementation of 3-dimensional discrete wavelet transform\",\"authors\":\"B. K. Mohanty, P. Meher\",\"doi\":\"10.1109/ASAP.2008.4580172\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we present a novel systolic architecture for high-throughput computation of 3-dimensional (3-D) discrete wavelet transform (DWT). The entire 3-D DWT computation is decomposed into three distinct stages and implemented concurrently in a linear array of fully pipelined processing elements (PE). The proposed structure for 3-D DWT provides higher throughput than the existing architecture; and involves nearly half or less the number of multipliers and adders; and less on-chip memory (when normalized for unit throughput rate) than the other. Most importantly, the proposed one does not require any frame buffer unlike the other to perform inter-frame DWT computation. The proposed structure has a small latency and can perform 3-D DWT computation with 100% hardware unitization efficiency.\",\"PeriodicalId\":246715,\"journal\":{\"name\":\"2008 International Conference on Application-Specific Systems, Architectures and Processors\",\"volume\":\"123 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-07-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 International Conference on Application-Specific Systems, Architectures and Processors\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASAP.2008.4580172\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 International Conference on Application-Specific Systems, Architectures and Processors","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASAP.2008.4580172","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Concurrent systolic architecture for high-throughput implementation of 3-dimensional discrete wavelet transform
In this paper, we present a novel systolic architecture for high-throughput computation of 3-dimensional (3-D) discrete wavelet transform (DWT). The entire 3-D DWT computation is decomposed into three distinct stages and implemented concurrently in a linear array of fully pipelined processing elements (PE). The proposed structure for 3-D DWT provides higher throughput than the existing architecture; and involves nearly half or less the number of multipliers and adders; and less on-chip memory (when normalized for unit throughput rate) than the other. Most importantly, the proposed one does not require any frame buffer unlike the other to perform inter-frame DWT computation. The proposed structure has a small latency and can perform 3-D DWT computation with 100% hardware unitization efficiency.