Multi-level Priors-Guided Diffusion-based Remote Sensing Image Super-Resolution

IF 12.2 · Tier 1 (Earth Science) · Q1 GEOGRAPHY, PHYSICAL
Lijing Lu, Zhou Huang, Yi Bao, Lin Wan, Zhihang Li
{"title":"基于多级先验制导扩散的遥感图像超分辨率","authors":"Lijing Lu ,&nbsp;Zhou Huang ,&nbsp;Yi Bao ,&nbsp;Lin Wan ,&nbsp;Zhihang Li","doi":"10.1016/j.isprsjprs.2025.07.020","DOIUrl":null,"url":null,"abstract":"<div><div>Recently, diffusion models have achieved advancements in natural image super-resolution (SR) tasks, overcoming some issues posed by traditional approaches, e.g., performance limitations in CNN-based and Transformer-based approaches, as well as instable training and mode collapse in GAN. However, despite these advancements, existing diffusion-based SR methods fail to perform well for remote sensing images. Current diffusion-based super-resolution techniques face two key challenges: (1) A jeopardy to the generative prior arises due to the necessity of training from scratch, which can lead to suboptimal performance. (2) A loss of fidelity occurs due to the limited priors in SR models, which only take the low-resolution image as input. To deal with these challenges, we introduce a Multi-level Priors-Guided Diffusion-based Remote Sensing Image Super-Resolution Model (DLMSR) approach. In particular, we utilize a pre-trained stable diffusion model to maintain the generative prior captured in synthesis models, resulting in more stable and detailed outcomes. Furthermore, to establish comprehensive priors, we incorporate multimodal large language models (MLLMs) to capture diverse priors such as texture and content priors. Additionally, we introduce category priors by employing a category classifier to offer global and concise signals for precise reconstruction. Then, we devise a cascade prior fusion module and a class-aware encoder to integrate rich priors into the diffusion model. DLMSR is extensively evaluated on four publicly available remote sensing datasets, including AID, DOTA, DIOR, and NWPU-RESISC45, demonstrating consistent advantages over representative state-of-the-art methods. In particular, compared with StableSR, DLMSR achieves an average increase of 0.29 dB in PSNR and a decrease of 1.93 in FID across three simulated benchmarks, indicating enhanced reconstruction fidelity and perceptual quality. The source code and dataset links are publicly available at: <span><span>https://github.com/lijing28/DLMSR.git</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"228 ","pages":"Pages 756-770"},"PeriodicalIF":12.2000,"publicationDate":"2025-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-level Priors-Guided Diffusion-based Remote Sensing Image Super-Resolution\",\"authors\":\"Lijing Lu ,&nbsp;Zhou Huang ,&nbsp;Yi Bao ,&nbsp;Lin Wan ,&nbsp;Zhihang Li\",\"doi\":\"10.1016/j.isprsjprs.2025.07.020\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Recently, diffusion models have achieved advancements in natural image super-resolution (SR) tasks, overcoming some issues posed by traditional approaches, e.g., performance limitations in CNN-based and Transformer-based approaches, as well as instable training and mode collapse in GAN. However, despite these advancements, existing diffusion-based SR methods fail to perform well for remote sensing images. Current diffusion-based super-resolution techniques face two key challenges: (1) A jeopardy to the generative prior arises due to the necessity of training from scratch, which can lead to suboptimal performance. 
(2) A loss of fidelity occurs due to the limited priors in SR models, which only take the low-resolution image as input. To deal with these challenges, we introduce a Multi-level Priors-Guided Diffusion-based Remote Sensing Image Super-Resolution Model (DLMSR) approach. In particular, we utilize a pre-trained stable diffusion model to maintain the generative prior captured in synthesis models, resulting in more stable and detailed outcomes. Furthermore, to establish comprehensive priors, we incorporate multimodal large language models (MLLMs) to capture diverse priors such as texture and content priors. Additionally, we introduce category priors by employing a category classifier to offer global and concise signals for precise reconstruction. Then, we devise a cascade prior fusion module and a class-aware encoder to integrate rich priors into the diffusion model. DLMSR is extensively evaluated on four publicly available remote sensing datasets, including AID, DOTA, DIOR, and NWPU-RESISC45, demonstrating consistent advantages over representative state-of-the-art methods. In particular, compared with StableSR, DLMSR achieves an average increase of 0.29 dB in PSNR and a decrease of 1.93 in FID across three simulated benchmarks, indicating enhanced reconstruction fidelity and perceptual quality. The source code and dataset links are publicly available at: <span><span>https://github.com/lijing28/DLMSR.git</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50269,\"journal\":{\"name\":\"ISPRS Journal of Photogrammetry and Remote Sensing\",\"volume\":\"228 \",\"pages\":\"Pages 756-770\"},\"PeriodicalIF\":12.2000,\"publicationDate\":\"2025-08-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ISPRS Journal of Photogrammetry and Remote Sensing\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0924271625002825\",\"RegionNum\":1,\"RegionCategory\":\"地球科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"GEOGRAPHY, PHYSICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ISPRS Journal of Photogrammetry and Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0924271625002825","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"GEOGRAPHY, PHYSICAL","Score":null,"Total":0}
Citations: 0

Abstract

Recently, diffusion models have advanced natural image super-resolution (SR), overcoming several issues of traditional approaches, such as the performance limitations of CNN- and Transformer-based methods and the unstable training and mode collapse of GANs. Despite these advances, existing diffusion-based SR methods perform poorly on remote sensing images. Current diffusion-based SR techniques face two key challenges: (1) the generative prior is compromised because models must be trained from scratch, which can lead to suboptimal performance; and (2) fidelity is lost because SR models rely on limited priors, taking only the low-resolution image as input. To address these challenges, we introduce DLMSR, a Multi-level Priors-Guided Diffusion-based Remote Sensing Image Super-Resolution model. Specifically, we build on a pre-trained Stable Diffusion model to preserve the generative prior captured by synthesis models, yielding more stable and detailed results. To establish comprehensive priors, we incorporate multimodal large language models (MLLMs) to capture diverse priors such as texture and content priors. We further introduce category priors via a category classifier, which provides a global, concise signal for precise reconstruction. A cascade prior fusion module and a class-aware encoder then integrate these rich priors into the diffusion model. DLMSR is evaluated extensively on four publicly available remote sensing datasets (AID, DOTA, DIOR, and NWPU-RESISC45) and shows consistent advantages over representative state-of-the-art methods. In particular, compared with StableSR, DLMSR achieves an average PSNR gain of 0.29 dB and an average FID reduction of 1.93 across three simulated benchmarks, indicating improved reconstruction fidelity and perceptual quality. The source code and dataset links are publicly available at: https://github.com/lijing28/DLMSR.git.
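To make the conditioning idea concrete, the sketch below illustrates, in plain PyTorch, how text-derived priors (e.g., embeddings of an MLLM-generated description) and a category prior could be fused into a single conditioning sequence for a diffusion denoiser. All module names, dimensions, and the fusion strategy here are illustrative assumptions, not the authors' implementation; the official code is in the repository linked above.

```python
# Minimal, hypothetical sketch of multi-level prior fusion for diffusion guidance.
# Names, dimensions, and the fusion strategy are assumptions for illustration only.
import torch
import torch.nn as nn


class CascadePriorFusion(nn.Module):
    """Fuses text-derived priors (e.g., MLLM caption embeddings) with a category
    prior into one conditioning sequence for a diffusion denoiser."""

    def __init__(self, text_dim=768, num_classes=45, cond_dim=768):
        # num_classes=45 mirrors the scene categories of NWPU-RESISC45 (assumption).
        super().__init__()
        self.text_proj = nn.Linear(text_dim, cond_dim)          # texture/content prior
        self.class_embed = nn.Embedding(num_classes, cond_dim)  # category prior
        self.fuse = nn.TransformerEncoderLayer(
            d_model=cond_dim, nhead=8, batch_first=True
        )

    def forward(self, text_tokens, class_ids):
        # text_tokens: (B, L, text_dim) embeddings of an MLLM-generated description
        # class_ids:   (B,) scene-category indices predicted by a classifier
        text = self.text_proj(text_tokens)                 # (B, L, cond_dim)
        cls = self.class_embed(class_ids).unsqueeze(1)     # (B, 1, cond_dim)
        cond = torch.cat([cls, text], dim=1)               # prepend the global signal
        return self.fuse(cond)                             # (B, L+1, cond_dim)


if __name__ == "__main__":
    fusion = CascadePriorFusion()
    text_tokens = torch.randn(2, 16, 768)   # stand-in MLLM embeddings
    class_ids = torch.tensor([3, 27])       # stand-in scene categories
    cond = fusion(text_tokens, class_ids)
    print(cond.shape)                       # torch.Size([2, 17, 768])
```

In a setup like this, the fused sequence would play the role that the text-encoder output plays in Stable Diffusion's cross-attention layers, with the prepended class token acting as the global category signal described in the abstract.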
Source Journal
ISPRS Journal of Photogrammetry and Remote Sensing
Category: Engineering & Technology - Imaging Science & Photographic Technology
CiteScore: 21.00
Self-citation rate: 6.30%
Articles published: 273
Review time: 40 days
Journal Description
The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive.

P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields.

In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.