SP-IGAN: An Improved GAN Framework for Effective Utilization of Semantic Priors in Real-World Image Super-Resolution
Meng Wang, Zhengnan Li, Haipeng Liu, Zhaoyu Chen, Kewei Cai
Entropy, vol. 27, no. 4, published 2025-04-11. DOI: 10.3390/e27040414. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12025809/pdf/
Single-image super-resolution (SISR) based on GANs has achieved significant progress. However, these methods still face challenges when reconstructing locally consistent textures due to a lack of semantic understanding of image categories. This highlights the necessity of focusing on contextual information comprehension and the acquisition of high-frequency details in model design. To address this issue, we propose the Semantic Prior-Improved GAN (SP-IGAN) framework, which incorporates additional contextual semantic information into the Real-ESRGAN model. The framework consists of two branches. The main branch introduces a Graph Convolutional Channel Attention (GCCA) module to transform channel dependencies into adjacency relationships between feature vertices, thereby enhancing pixel associations. The auxiliary branch strengthens the correlation between semantic category information and regional textures in the Residual-in-Residual Dense Block (RRDB) module. The auxiliary branch employs a pretrained segmentation model to accurately extract regional semantic information from the input low-resolution image. This information is injected into the RRDB module through Spatial Feature Transform (SFT) layers, generating more accurate and semantically consistent texture details. Additionally, a wavelet loss is incorporated into the loss function to capture high-frequency details that are often overlooked. The experimental results demonstrate that the proposed SP-IGAN outperforms state-of-the-art (SOTA) super-resolution models across multiple public datasets. For the X4 super-resolution task, SP-IGAN achieves a 0.55 dB improvement in Peak Signal-to-Noise Ratio (PSNR) and a 0.0363 increase in Structural Similarity Index (SSIM) compared to the baseline model Real-ESRGAN.
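The abstract does not include implementation details, but the injection mechanism it names, the Spatial Feature Transform (SFT) layer, follows a well-established pattern: a semantic condition map predicts a per-pixel scale and shift that modulate the trunk features. The sketch below is a minimal PyTorch-style illustration of such a layer under that assumption; the `SFTLayer` name, channel counts, and two-convolution heads are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SFTLayer(nn.Module):
    """Spatial Feature Transform: modulates trunk features with a per-pixel
    affine transform (scale, shift) predicted from a semantic condition map."""

    def __init__(self, feat_channels: int = 64, cond_channels: int = 32):
        super().__init__()
        # Two small convolutional heads predict the modulation parameters.
        self.scale_head = nn.Sequential(
            nn.Conv2d(cond_channels, feat_channels, kernel_size=1),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(feat_channels, feat_channels, kernel_size=1),
        )
        self.shift_head = nn.Sequential(
            nn.Conv2d(cond_channels, feat_channels, kernel_size=1),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(feat_channels, feat_channels, kernel_size=1),
        )

    def forward(self, feat: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # feat: (N, feat_channels, H, W) features from the RRDB trunk
        # cond: (N, cond_channels, H, W) condition map derived from the
        #       segmentation of the low-resolution input
        scale = self.scale_head(cond)
        shift = self.shift_head(cond)
        # (scale + 1) centres the multiplicative modulation around identity.
        return feat * (scale + 1) + shift


if __name__ == "__main__":
    # Toy usage: random tensors stand in for real trunk features and
    # segmentation-derived condition maps.
    sft = SFTLayer(feat_channels=64, cond_channels=32)
    feat = torch.randn(1, 64, 48, 48)
    cond = torch.randn(1, 32, 48, 48)
    print(sft(feat, cond).shape)  # torch.Size([1, 64, 48, 48])
```

In the framework described above, the condition map would come from the pretrained segmentation model applied to the low-resolution input, and the modulated features would feed back into the RRDB trunk; here random tensors stand in for both.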
Journal Introduction:
Entropy (ISSN 1099-4300) is an international and interdisciplinary journal of entropy and information studies that publishes reviews, regular research papers, and short notes. Our aim is to encourage scientists to publish their theoretical and experimental work in as much detail as possible. There is no restriction on the length of papers. If the work involves computation or experiments, the details must be provided so that the results can be reproduced.