A Unified Accelerator for All-in-One Image Restoration Based on Prompt Degradation Learning
Authors: Siyu Zhang; Qiwei Dong; Wendong Mao; Zhongfeng Wang
Journal: IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 72, no. 3, pp. 1282-1295 (published 2025-01-01)
Impact factor: 5.2 (JCR Q1, Engineering, Electrical & Electronic)
DOI: 10.1109/TCSI.2024.3519532
URL: https://ieeexplore.ieee.org/document/10819973/
Citations: 0
Abstract
All-in-one image restoration (IR) uses a single model to recover images from various unknown distortions, such as rain, haze, and blur. Transformer-based IR methods have significantly improved the visual quality of restored images. However, deploying complex IR models on edge devices is challenging because of their large parameter counts and intensive computation. Moreover, existing accelerators are typically customized for a single task, resulting in severe resource underutilization when executing multiple tasks. Therefore, this paper develops an algorithm-hardware co-design framework to accelerate a novel CNN-Transformer cooperative model for multiple IR tasks. First, at the algorithm level, an Efficient Restoration Foundational Model (ERFM) is proposed to recover corrupted images from various degradations with low model complexity. Second, to guide adaptive corruption removal, a novel prompt learning scheme is introduced that fuses context-related degradation cues to improve reconstruction quality. Third, at the hardware level, an integer approximation method is proposed to avoid the expensive hardware overhead of complex nonlinear operations, such as layer normalization and softmax, while maintaining comparable IR quality. Moreover, a head-stationary dataflow and a softmax fusion mechanism are designed to reduce data movement and enhance on-chip resource utilization. Finally, an overall hardware architecture is developed and implemented in TSMC 28 nm CMOS technology. Experimental results show that our ERFM achieves better visual quality than other baselines on seven challenging IR tasks without task-specific fine-tuning. Moreover, compared to other accelerators for vision Transformers, our design achieves $3.3\times$ and $3.7\times$ improvements in throughput and energy efficiency, respectively.
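The abstract does not spell out how the integer approximation works, so the following is only a generic sketch of the idea of an integer-friendly softmax, not the paper's method. It assumes fixed-point logits, rewrites exp(-x) as 2^(-x log2 e), handles the integer part of the exponent with a right shift, and fits the fractional part with a straight line. The names `FRAC_BITS`, `OUT_BITS`, and `int_softmax` are hypothetical and chosen for illustration.

```python
import math

FRAC_BITS = 8    # assumed number of fractional bits in the fixed-point logits
OUT_BITS = 15    # assumed output precision: probabilities scaled by 2**15
# log2(e) stored in the same fixed-point format (a compile-time constant in hardware)
LOG2E_Q = round(math.log2(math.e) * (1 << FRAC_BITS))


def int_softmax(q_logits):
    """Approximate softmax on fixed-point integer logits using integer ops only."""
    q_max = max(q_logits)
    exps = []
    for q in q_logits:
        d = q_max - q                           # non-negative distance from the max
        t = (d * LOG2E_Q) >> FRAC_BITS          # base-2 exponent: exp(-x) = 2**(-x*log2(e))
        n = t >> FRAC_BITS                      # integer part, applied as a right shift
        r = t & ((1 << FRAC_BITS) - 1)          # fractional part f = r / 2**FRAC_BITS
        mant = (1 << FRAC_BITS) - (r >> 1)      # linear fit: 2**(-f) ~ 1 - f/2 on [0, 1)
        exps.append(mant >> n)
    total = sum(exps)                           # always > 0: the max element contributes 2**FRAC_BITS
    # Integer division stands in for whatever divider or reciprocal unit the hardware uses.
    return [(e << OUT_BITS) // total for e in exps]


def float_softmax(xs):
    """Floating-point reference for comparison."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


logits = [1.5, 0.25, -2.0, 3.0]
q = [round(x * (1 << FRAC_BITS)) for x in logits]        # quantize to fixed point
approx = [p / (1 << OUT_BITS) for p in int_softmax(q)]   # rescale for comparison
exact = float_softmax(logits)
max_err = max(abs(a - b) for a, b in zip(approx, exact))
print("approx:", approx)
print("exact: ", exact)
print("max abs error:", max_err)
```

The sketch trades accuracy for simplicity: the linear fit replaces the exponential unit with a subtract and a shift, which is the kind of saving an integer approximation aims for. A real design would tune the bit widths and the fit against restoration quality.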
About the journal:
TCAS I publishes regular papers in the field specified by the theory, analysis, design, and practical implementations of circuits, and the application of circuit techniques to systems and to signal processing. Included is the whole spectrum from basic scientific theory to industrial applications. The field of interest covered includes:
- Circuits: Analog, Digital and Mixed Signal Circuits and Systems
- Nonlinear Circuits and Systems, Integrated Sensors, MEMS and Systems on Chip, Nanoscale Circuits and Systems, Optoelectronic Circuits and Systems, Power Electronics and Systems
- Software for Analog-and-Logic Circuits and Systems
- Control aspects of Circuits and Systems