A Single Residual Network with ESA Modules and Distillation

Yucong Wang, Minjie Cai
{"title":"A Single Residual Network with ESA Modules and Distillation","authors":"Yucong Wang, Minjie Cai","doi":"10.1109/CVPRW59228.2023.00191","DOIUrl":null,"url":null,"abstract":"Although there are many methods based on deep learning that have superior performance on single image super-resolution (SISR), it is difficult to run in real time on devices with limited computing power. Some recent studies have found that simply relying on reducing parameters or reducing the theoretical FLOPs of the model does not speed up the inference time of the network in a practical sense. Actual speed on the device is probably a better measure than FLOPs. In this work, we propose a new single residual network (SRN). On the one hand, we try to introduce and optimize an attention mechanism module to improve the performance of the network with a relatively small speed loss. On the other hand, we find that residuals in residual blocks do not have a positive impact on networks with adjusted ESA. Therefore, the residual of the network residual block is removed, which not only improves the speed of the network, but also improves the performance of the network. Finally, we reduced the number of channels and the number of residual blocks of the classic model EDSR, and removed the last convolution before the long residual. We set this tuned EDSR as the teacher model and our newly proposed SRN as the student model. Under the joint effect of the original loss and the distillation loss, the performance of the network can be improved without losing the inference time. Combining the above strategies, our proposed model runs much faster than similarly performing models. As an example, we built a Fast and Efficient Network (SRN) and its small version SRN-S, which run 30%-37% faster than the state-of-the-art EISR model: a paper champion RLFN. Furthermore, the shallow version of SRN-S achieves the second-shortest inference time as well as the second-smallest number of activations in the NTIRE2023 challenge. Code will be available at https://github.com/wnxbwyc/SRN.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPRW59228.2023.00191","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Although many deep-learning-based methods achieve superior performance on single image super-resolution (SISR), they are difficult to run in real time on devices with limited computing power. Recent studies have found that simply reducing the number of parameters or the theoretical FLOPs of a model does not shorten its inference time in practice, so actual speed on the device is probably a better measure than FLOPs. In this work, we propose a new single residual network (SRN). On the one hand, we introduce and optimize an attention module (ESA) to improve the performance of the network at a relatively small cost in speed. On the other hand, we find that the skip connections inside residual blocks do not have a positive impact on networks equipped with the adjusted ESA; removing them not only speeds up the network but also improves its performance. Finally, we reduce the number of channels and residual blocks of the classic EDSR model and remove the last convolution before the long skip connection. We use this tuned EDSR as the teacher model and our newly proposed SRN as the student model. Under the joint effect of the original loss and the distillation loss, the performance of the network improves without any increase in inference time. Combining these strategies, our proposed model runs much faster than models with similar performance. As an example, our SRN and its small version SRN-S run 30%-37% faster than the state-of-the-art EISR model RLFN, a previous challenge champion. Furthermore, the shallow SRN-S achieves the second-shortest inference time as well as the second-smallest number of activations in the NTIRE 2023 challenge. Code will be available at https://github.com/wnxbwyc/SRN.
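
To make the two main ideas in the abstract concrete, the following is a minimal PyTorch sketch (not the authors' released code) of a residual-free conv block gated by a simplified ESA-style attention, together with the combined reconstruction + distillation objective. The module sizes, the attention design, and the loss weight `distill_weight` are illustrative assumptions.

```python
# Hypothetical sketch: residual-free block with ESA-style attention and a
# combined distillation loss. Not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimplifiedESA(nn.Module):
    """Toy spatial-attention gate loosely following the ESA idea:
    squeeze channels, downsample, convolve, upsample, and apply a mask."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        mid = channels // reduction
        self.squeeze = nn.Conv2d(channels, mid, 1)
        self.conv = nn.Conv2d(mid, mid, 3, padding=1)
        self.expand = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        a = self.squeeze(x)
        a = F.max_pool2d(a, kernel_size=7, stride=3)   # enlarge receptive field
        a = self.conv(a)
        a = F.interpolate(a, size=x.shape[-2:], mode="bilinear",
                          align_corners=False)
        mask = torch.sigmoid(self.expand(a))
        return x * mask                                # gate the features

class PlainBlock(nn.Module):
    """Conv-ReLU-Conv followed by attention, with NO inner skip connection,
    reflecting the finding that the block-level residual can be dropped."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.esa = SimplifiedESA(channels)

    def forward(self, x):
        y = self.conv2(F.relu(self.conv1(x)))
        return self.esa(y)                             # note: no `+ x` here

def training_loss(student_sr, teacher_sr, hr, distill_weight=0.5):
    """Combined objective: L1 to the ground truth plus L1 to the frozen
    teacher output; the weighting is a hypothetical choice."""
    rec = F.l1_loss(student_sr, hr)
    distill = F.l1_loss(student_sr, teacher_sr.detach())
    return rec + distill_weight * distill
```

Since the distillation term only changes the training objective, the student keeps the same architecture and therefore the same inference time, which is the property the abstract highlights.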