Frequency-aware deep networks and patch-wise generative adversarial training for single image super-resolution in wavelet domain

IF 6.6 | Region 1: Computer Science | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing | Pub Date: 2025-10-09 | DOI: 10.1016/j.asoc.2025.114035

Masuma Aktar, Kuldeep Singh Yadav, Rabul Hussain Laskar

Applied Soft Computing, Volume 185, Article 114035

Full text: https://www.sciencedirect.com/science/article/pii/S1568494625013481

Citations: 0

Abstract


With the increasing demand for high-resolution displays, image super-resolution has become essential for enhancing visual quality, especially in the era of 4K and 8K devices. Single image super-resolution (SISR) plays a vital role in numerous computer vision applications because it enhances image resolution while preserving important details. Recently, deep learning-based super-resolution methods, including those based on generative adversarial networks (GANs) and transformers, have become mainstream. However, existing methods find it hard to preserve high-frequency details while maintaining overall image quality, and they often struggle to reconstruct realistic, undistorted high-frequency information. To address this challenge, we propose the frequency-aware perceptual wavelet domain super-resolution (FA-PWSR) model, which processes an image's low-frequency and high-frequency components separately and in parallel. FA-PWSR employs a divide-and-conquer strategy: it uses the stationary wavelet transform (SWT) to decompose images into low- and high-frequency sub-images, and specialized subnetworks process each component so that each frequency band is optimized separately. The reconstructed sub-images are then combined using the inverse stationary wavelet transform (ISWT) to generate the final SR image. FA-PWSR outperforms state-of-the-art methods in both fidelity metrics (PSNR and SSIM) and perceptual metrics (PI and LPIPS) across datasets including Set5, Set14, Urban100, and BSD100. Even on the challenging Urban100 dataset, FA-PWSR achieves a 1.754 dB improvement in PSNR and a 0.049 increase in SSIM compared to ESRGAN. Furthermore, our model significantly enhances perceptual quality, achieving a 12% reduction in LPIPS on the same dataset. Moreover, visual comparisons confirm that FA-PWSR preserves fine details, enhances edges, and noticeably reduces artifacts, leading to more realistic, higher-fidelity super-resolved images.
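The decompose, process per band, and recombine flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a one-level undecimated Haar transform with periodic boundaries and a simple averaging normalization, and the names `swt2`, `iswt2`, `process_in_wavelet_domain`, `low_branch`, and `high_branch` are hypothetical stand-ins (the two branches are plain callables here, not the paper's subnetworks).

```python
def swt1d(x):
    """One-level undecimated Haar transform of a sequence (periodic boundary).

    Assumed normalization: averages and half-differences, so no downsampling
    happens and both outputs keep the input length.
    """
    n = len(x)
    approx = [(x[i] + x[(i + 1) % n]) / 2 for i in range(n)]
    detail = [(x[i] - x[(i + 1) % n]) / 2 for i in range(n)]
    return approx, detail


def iswt1d(approx, detail):
    """Invert swt1d by averaging the two available estimates of each sample."""
    n = len(approx)
    # x[i] == a[i] + d[i] and also x[i] == a[i-1] - d[i-1] (index -1 wraps around).
    return [((approx[i] + detail[i]) + (approx[i - 1] - detail[i - 1])) / 2
            for i in range(n)]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def swt2(img):
    """Split a 2D image into one low-frequency and three high-frequency sub-images."""
    lo_rows, hi_rows = [], []
    for row in img:  # filter along each row first
        a, d = swt1d(row)
        lo_rows.append(a)
        hi_rows.append(d)

    def along_columns(matrix):  # then filter along each column
        lo_cols, hi_cols = [], []
        for col in transpose(matrix):
            a, d = swt1d(col)
            lo_cols.append(a)
            hi_cols.append(d)
        return transpose(lo_cols), transpose(hi_cols)

    ll, lh = along_columns(lo_rows)
    hl, hh = along_columns(hi_rows)
    return ll, (lh, hl, hh)


def iswt2(ll, highs):
    """Recombine the four sub-images produced by swt2 into one image."""
    lh, hl, hh = highs

    def merge_columns(lo, hi):
        cols = [iswt1d(a, d) for a, d in zip(transpose(lo), transpose(hi))]
        return transpose(cols)

    lo_rows = merge_columns(ll, lh)
    hi_rows = merge_columns(hl, hh)
    return [iswt1d(a, d) for a, d in zip(lo_rows, hi_rows)]


def process_in_wavelet_domain(img, low_branch, high_branch):
    """Apply separate (hypothetical) branches to the low and high bands, then invert."""
    ll, highs = swt2(img)
    ll_out = low_branch(ll)
    highs_out = tuple(high_branch(band) for band in highs)
    return iswt2(ll_out, highs_out)
```

Calling `process_in_wavelet_domain(img, lambda b: b, lambda b: b)` with identity branches returns the input image, which is a quick check that the transform pair is invertible. Because the transform is undecimated, every sub-image has the same size as the input, so each branch can work on full-resolution bands. A library such as PyWavelets provides production-grade versions of these transforms, with different normalization conventions.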


Source journal

Applied Soft Computing (Engineering & Technology - Computer Science: Interdisciplinary Applications)

CiteScore: 15.80

Self-citation rate: 6.90%

Articles published: 874

Review time: 10.9 months

About the journal: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in the application and convergence of the areas of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.