A flow-based multi-scale learning network for single image stochastic super-resolution

IF 3.4 · Zone 3 (Engineering & Technology) · JCR Q2 (ENGINEERING, ELECTRICAL & ELECTRONIC)

Signal Processing-Image Communication, vol. 125, Article 117132 · Pub Date: 2024-04-24 · DOI: 10.1016/j.image.2024.117132

Qianyu Wu, Zhongqian Hu, Aichun Zhu, Hui Tang, Jiaxin Zou, Yan Xi, Yang Chen

Citations: 0

Abstract

Single image super-resolution (SISR) remains an important yet challenging task. Existing methods usually ignore the diversity of the generated super-resolution (SR) images. Because detail is degraded in low-resolution (LR) images, the fine details of the corresponding high-resolution (HR) images cannot be recovered with confidence. To address this issue, this paper presents a flow-based multi-scale learning network (FMLnet) to explore the diverse mapping spaces for SR. First, we propose a multi-scale learning block (MLB) to extract the underlying features of the LR image. Second, the introduced pixel-wise multi-head attention allows our model to map multiple representation subspaces simultaneously. Third, by employing a normalizing flow module, our approach generates a variety of stochastic SR outputs with high visual quality for a given LR input. The trade-off between fidelity and perceptual quality can be controlled. Finally, experimental results on five datasets demonstrate that the proposed network outperforms existing methods in terms of diversity and achieves competitive PSNR/SSIM results. Code is available at https://github.com/qianyuwu/FMLnet.
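To make the idea of stochastic SR outputs from a normalizing flow more concrete, the following is a minimal one-dimensional sketch of a conditional affine flow. It is not the FMLnet implementation: the conditioning function, its constants, and the `temperature` parameter are assumptions made for illustration. Scaling the latent sample by a temperature is a common device in flow-based models, and the abstract does not state how the authors control the fidelity/perceptual trade-off.

```python
import math
import random


def conditioning(lr_feature):
    """Hypothetical stand-in for a learned network: maps an LR feature
    to the shift and log-scale of an affine transform."""
    shift = 2.0 * lr_feature
    log_scale = -0.5 + 0.1 * lr_feature
    return shift, log_scale


def forward(hr_value, lr_feature):
    """HR -> latent direction. Returns z and log|dz/dx|."""
    shift, log_scale = conditioning(lr_feature)
    z = (hr_value - shift) * math.exp(-log_scale)
    return z, -log_scale


def inverse(z, lr_feature):
    """Latent -> HR direction, the exact inverse of forward()."""
    shift, log_scale = conditioning(lr_feature)
    return shift + math.exp(log_scale) * z


def log_likelihood(hr_value, lr_feature):
    """Change of variables: log p(x | lr) = log N(z; 0, 1) + log|dz/dx|."""
    z, log_det = forward(hr_value, lr_feature)
    log_pz = -0.5 * (z * z + math.log(2.0 * math.pi))
    return log_pz + log_det


def sample(lr_feature, temperature, rng):
    """Draw z ~ N(0, temperature^2) and map it through the inverse flow.
    temperature = 0 gives a single deterministic output; larger values
    give more varied outputs for the same LR input."""
    z = temperature * rng.gauss(0.0, 1.0)
    return inverse(z, lr_feature)


rng = random.Random(0)
lr = 0.3
deterministic = [sample(lr, 0.0, rng) for _ in range(3)]
diverse = [sample(lr, 0.8, rng) for _ in range(3)]
```

In this toy setting, every entry of `deterministic` equals the shift produced by the conditioning function, while the entries of `diverse` differ from one another. A real SR model would replace the scalar affine map with a deep invertible network over image tensors.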

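Source journal

Signal Processing-Image Communication (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 8.40
Self-citation rate: 2.90%
Articles published: 138
Review time: 5.2 months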

About the journal: Signal Processing: Image Communication is an international journal for the development of the theory and practice of image communication. Its primary objectives are the following: To present a forum for the advancement of theory and practice of image communication. To stimulate cross-fertilization between areas similar in nature which have traditionally been separated, for example, various aspects of visual communications and information systems. To contribute to a rapid information exchange between the industrial and academic environments.

The editorial policy and the technical content of the journal are the responsibility of the Editor-in-Chief, the Area Editors and the Advisory Editors. The Journal is self-supporting from subscription income and contains a minimum amount of advertisements. Advertisements are subject to the prior approval of the Editor-in-Chief. The journal welcomes contributions from every country in the world.

Signal Processing: Image Communication publishes articles relating to aspects of the design, implementation and use of image communication systems. The journal features original research work, tutorial and review articles, and accounts of practical developments. Subjects of interest include image/video coding, 3D video representations and compression, 3D graphics and animation compression, HDTV and 3DTV systems, video adaptation, video over IP, peer-to-peer video networking, interactive visual communication, multi-user video conferencing, wireless video broadcasting and communication, visual surveillance, 2D and 3D image/video quality measures, pre/post processing, video restoration and super-resolution, multi-camera video analysis, motion analysis, content-based image/video indexing and retrieval, face and gesture processing, video synthesis, 2D and 3D image/video acquisition and display technologies, architectures for image/video processing and communication.