Swin-Diff: a single defocus image deblurring network based on diffusion model

IF 4.6, Zone 2 (Computer Science), JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Complex & Intelligent Systems, Pub Date: 2025-02-17, DOI: 10.1007/s40747-025-01789-w

Hanyan Liang, Shuyao Chai, Xixuan Zhao, Jiangming Kan

Citations: 0

Abstract

Single Image Defocus Deblurring (SIDD) remains challenging due to spatially varying blur kernels, particularly in processing high-resolution images where traditional methods often struggle with artifact generation, detail preservation, and computational efficiency. This paper presents Swin-Diff, a novel architecture integrating diffusion models with Transformer-based networks for robust defocus deblurring. Our approach employs a two-stage training strategy where a diffusion model generates prior information in a compact latent space, which is then hierarchically fused with intermediate features to guide the regression model. The architecture incorporates a dual-dimensional self-attention mechanism operating across channel and spatial domains, enhancing long-range modeling capabilities while maintaining linear computational complexity. Extensive experiments on three public datasets (DPDD, RealDOF, and RTF) demonstrate Swin-Diff’s superior performance, achieving average improvements of 1.37% in PSNR, 3.6% in SSIM, 2.3% in MAE, and 25.2% in LPIPS metrics compared to state-of-the-art methods. Our results validate the effectiveness of combining diffusion models with hierarchical attention mechanisms for high-quality defocus blur removal.
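As a minimal sketch of why attention computed across the channel dimension can stay linear in image size, the example below builds a C x C attention map over flattened feature maps instead of an N x N map over pixels. It is not the Swin-Diff implementation: the abstract does not specify the layer design, so the function name `channel_attention`, the 1/sqrt(N) scaling, and the toy input are assumptions made purely for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(q, k, v):
    """Channel-wise self-attention on C x N feature maps (lists of lists).

    q, k and v each hold C channels with N flattened pixel values.
    The attention map is C x C, so the cost is O(C^2 * N): linear in N.
    """
    n = len(q[0])
    scale = 1.0 / math.sqrt(n)  # assumed scaling, chosen for this sketch only
    # scores[i][j]: similarity between channel i of q and channel j of k
    scores = [[scale * sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    # Each output channel is a weighted mix of the channels of v
    out = [[sum(w * vj[p] for w, vj in zip(row, v)) for p in range(n)]
           for row in weights]
    return weights, out


# Toy input: C = 2 channels, N = 4 pixels (hypothetical values)
features = [[1.0, 0.0, 1.0, 0.0],
            [0.0, 1.0, 0.0, 1.0]]
weights, out = channel_attention(features, features, features)
print(len(weights), len(weights[0]))  # attention map shape: C x C
print(len(out), len(out[0]))          # output shape: C x N
```

Because the attention map here has C x C entries regardless of resolution, doubling the pixel count N doubles the work instead of quadrupling it. That is one common way to obtain the kind of linear complexity the abstract mentions; the spatial branch of the actual model may use a different mechanism.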

Source journal

Complex & Intelligent Systems (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

CiteScore: 9.60
Self-citation rate: 10.30%
Articles published: 297

About the journal: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.