An Area-Efficient VLSI Architecture for High-Throughput Computation of the 2-D DWT

IF 2.8 | Zone 2, Engineering Technology | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Very Large Scale Integration (VLSI) Systems | Pub Date: 2025-01-27 | DOI: 10.1109/TVLSI.2025.3529690

Yuzhou Dai;Wei Zhang;Lin Shi;Qitao Li;Zhuolun Wu;Yanyan Liu

Volume 33, Issue 5, Pages 1292-1303 | Journal Article | https://ieeexplore.ieee.org/document/10855161/

Citations: 0


An Area-Efficient VLSI Architecture for High-Throughput Computation of the 2-D DWT

In this article, an area-efficient VLSI architecture scheme for high-throughput computation of the 2-D discrete wavelet transform (DWT) is proposed, effectively applied in the context of aircraft cargo hold scenes. The proposed architecture aims to reduce computation and storage resources while maintaining the DWT-IDWT reconstructed image quality for the 9/7 discrete wavelet. The hardware implementation formulae based on the flipping architecture have been modified to reduce RAM storage bit width. By transforming the coefficients of the formula into hardware-friendly values, the required multiplication operations are split into two stages of addition. On this basis, a pipelined architecture is constructed to set the critical path delay (CPD) of the architecture to be close to the delay of a single adder, $T_{a}$, thereby achieving a high throughput. Compared to existing architectures in the research field, the proposed single-level 2-D DWT architecture achieves resource savings on the field-programmable gate array (FPGA) platform while ensuring good image reconstruction quality. The advantages of the multilevel 2-D DWT are even more pronounced. In the simulation results on the application-specific integrated circuit (ASIC) platform, the proposed architecture reduces computation time by at least 35.54% while achieving a higher level of decomposition, decreases the area-delay product (ADP) by at least 25.41%, and saves a significant amount of energy per image (EPI). Furthermore, the proposed folded architecture achieves close to 100% hardware utilization efficiency (HUE) in multilevel 2-D DWT computations.
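The idea of replacing constant multiplications with additions can be illustrated with a small software sketch. The code below is not the paper's architecture or its modified formulae, which the abstract does not give. It assumes the textbook 9/7 lifting constants, an 8-bit fractional precision (`FRAC_BITS`), symmetric boundary extension, and it omits the final scaling step; all function names are hypothetical. It shows a 1-D integer lifting pass in which every coefficient multiplication is carried out with shifts and adds only, and in which the inverse pass recovers the input exactly.

```python
# Textbook CDF 9/7 lifting constants (assumption: NOT the paper's modified values).
ALPHA, BETA, GAMMA, DELTA = -1.586134342, -0.05298011854, 0.8829110762, 0.4435068522
FRAC_BITS = 8  # assumed fixed-point precision for the coefficients


def to_fixed(c, frac_bits=FRAC_BITS):
    """Quantize a real coefficient to an integer numerator over 2**frac_bits."""
    return round(c * (1 << frac_bits))


def shift_add_mul(x, c_fixed, frac_bits=FRAC_BITS):
    """Return floor(x * c_fixed / 2**frac_bits) using only shifts and adds."""
    sign = -1 if c_fixed < 0 else 1
    mag, acc, bit = abs(c_fixed), 0, 0
    while mag:
        if mag & 1:
            acc += x << bit  # one addition per set bit of the constant
        mag >>= 1
        bit += 1
    return (sign * acc) >> frac_bits


def _predict(s, d, c, sign):
    """Update odd samples d from even samples s (symmetric extension at the end)."""
    for i in range(len(d)):
        right = s[i + 1] if i + 1 < len(s) else s[i]
        d[i] += sign * shift_add_mul(s[i] + right, c)


def _update(s, d, c, sign):
    """Update even samples s from odd samples d (symmetric extension at the start)."""
    for i in range(len(s)):
        left = d[i - 1] if i > 0 else d[0]
        s[i] += sign * shift_add_mul(left + d[i], c)


def forward_97(x):
    """One level of 1-D integer 9/7 lifting; returns (low-pass, high-pass)."""
    assert len(x) % 2 == 0, "this sketch assumes an even-length signal"
    s, d = list(x[0::2]), list(x[1::2])
    a, b, g, dl = (to_fixed(c) for c in (ALPHA, BETA, GAMMA, DELTA))
    _predict(s, d, a, +1)
    _update(s, d, b, +1)
    _predict(s, d, g, +1)
    _update(s, d, dl, +1)
    return s, d


def inverse_97(s, d):
    """Undo forward_97 by running the same steps in reverse with opposite sign."""
    s, d = list(s), list(d)
    a, b, g, dl = (to_fixed(c) for c in (ALPHA, BETA, GAMMA, DELTA))
    _update(s, d, dl, -1)
    _predict(s, d, g, -1)
    _update(s, d, b, -1)
    _predict(s, d, a, -1)
    x = [0] * (len(s) + len(d))
    x[0::2], x[1::2] = s, d
    return x


if __name__ == "__main__":
    row = [12, 15, 9, 200, 180, 33, 7, 0, 255, 128]
    low, high = forward_97(row)
    print(low, high, inverse_97(low, high) == row)
```

Because each lifting step modifies one channel using only the other, the integer pass stays exactly invertible however coarsely the coefficients are quantized; coarser quantization changes the filter response, not the reversibility. A 2-D transform would apply such a pass along rows and then columns.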


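Source journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems (Engineering Technology: Engineering, Electrical & Electronic)

CiteScore: 6.40

Self-citation rate: 7.10%

Articles published: 187

Review time: 3.6 months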

About the journal: The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society. Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chip and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels. To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chip and wafer fabrication, testing and packaging, and systems-level qualification. Thus, the coverage of these Transactions will focus on VLSI/ULSI microelectronic systems integration.