Multi-Dimensional Dynamic Model Compression for Efficient Image Super-Resolution

Zejiang Hou, S. Kung
{"title":"Multi-Dimensional Dynamic Model Compression for Efficient Image Super-Resolution","authors":"Zejiang Hou, S. Kung","doi":"10.1109/WACV51458.2022.00355","DOIUrl":null,"url":null,"abstract":"Modern single image super-resolution (SR) system based on convolutional neural networks achieves substantial progress. However, most SR deep networks are computationally expensive and require excessively large activation memory footprints, impeding their effective deployment to resource-limited devices. Based on the observation that the activation patterns in SR networks exhibit high input-dependency, we propose Multi-Dimensional Dynamic Model Compression method that can reduce both spatial and channel wise redundancy in an SR deep network for different input images. To reduce the spatial-wise redundancy, we propose to perform convolution on scaled-down feature-maps where the down-scaling factor is made adaptive to different input images. To reduce the channel-wise redundancy, we introduce a low-cost channel saliency predictor for each convolution to dynamically skip the computation of unimportant channels based on the Gumbel-Softmax. To better capture the feature-maps information and facilitate input-adaptive decision, we employ classic image processing metrics, e.g., Spatial Information, to guide the saliency predictors. The proposed method can be readily applied to a variety of SR deep networks and trained end-to-end with standard super-resolution loss, in combination with a sparsity criterion. Experiments on several benchmarks demonstrate that our method can effectively reduce the FLOPs of both lightweight and non-compact SR models with negligible PSNR loss. 
Moreover, our compressed models achieve competitive PSNR-FLOPs Pareto frontier compared with SOTA NAS-based SR methods.","PeriodicalId":297092,"journal":{"name":"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WACV51458.2022.00355","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Modern single-image super-resolution (SR) systems based on convolutional neural networks have achieved substantial progress. However, most deep SR networks are computationally expensive and require excessively large activation memory footprints, impeding their effective deployment on resource-limited devices. Based on the observation that the activation patterns in SR networks exhibit high input-dependency, we propose a Multi-Dimensional Dynamic Model Compression method that can reduce both spatial and channel-wise redundancy in a deep SR network for different input images. To reduce the spatial redundancy, we propose to perform convolution on scaled-down feature maps, where the down-scaling factor is made adaptive to different input images. To reduce the channel-wise redundancy, we introduce a low-cost channel saliency predictor for each convolution that dynamically skips the computation of unimportant channels based on the Gumbel-Softmax. To better capture the feature-map information and facilitate input-adaptive decisions, we employ classic image processing metrics, e.g., Spatial Information, to guide the saliency predictors. The proposed method can be readily applied to a variety of deep SR networks and trained end-to-end with the standard super-resolution loss in combination with a sparsity criterion. Experiments on several benchmarks demonstrate that our method can effectively reduce the FLOPs of both lightweight and non-compact SR models with negligible PSNR loss. Moreover, our compressed models achieve a competitive PSNR-FLOPs Pareto frontier compared with SOTA NAS-based SR methods.
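The abstract describes the channel-skipping mechanism only at a high level: a low-cost predictor scores each channel, and a Gumbel-Softmax sample turns the scores into a hard keep/skip decision that remains differentiable during training. The following is a minimal NumPy sketch of that general idea, not the authors' implementation; the pooled-descriptor-to-logits mapping (a single scalar weight `w`) and the function names are hypothetical stand-ins for the paper's learned predictor.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, hard=True, rng=None):
    """Relax a categorical choice with Gumbel(0,1) noise; optionally harden."""
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0,1) noise via the inverse-CDF of -log(-log(U)).
    g = -np.log(-np.log(rng.uniform(1e-9, 1.0, logits.shape)))
    y = logits + g
    # Temperature-scaled softmax (numerically stabilized by the max-shift).
    y = np.exp((y - y.max(axis=-1, keepdims=True)) / tau)
    y = y / y.sum(axis=-1, keepdims=True)
    if hard:
        # Straight-through style: emit a one-hot decision in the forward pass
        # (in a training framework, gradients would flow through the soft y).
        return (y == y.max(axis=-1, keepdims=True)).astype(y.dtype)
    return y

def channel_saliency_mask(features, w=1.0, tau=1.0, rng=None):
    """Hypothetical low-cost predictor: global-average-pool each channel,
    map the descriptor to per-channel [keep, skip] logits, then sample a
    {0,1} keep-mask of shape (C,) with the Gumbel-Softmax."""
    pooled = features.mean(axis=(1, 2))                 # (C,) channel descriptors
    logits = np.stack([w * pooled, -w * pooled], axis=-1)  # (C, 2) keep/skip scores
    decision = gumbel_softmax(logits, tau=tau, hard=True, rng=rng)
    return decision[..., 0]                             # 1 = compute this channel
```

In an SR network, a convolution would then be evaluated only on the channels where the mask is 1, and a sparsity criterion on the masks would push the expected FLOPs down during training.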