Diffusion Models in Low-Level Vision: A Survey

Impact factor: 18.6

IEEE Transactions on Pattern Analysis and Machine Intelligence. Publication date: 2025-02-24. DOI: 10.1109/TPAMI.2025.3545047

Chunming He;Yuqi Shen;Chengyu Fang;Fengyang Xiao;Longxiang Tang;Yulun Zhang;Wangmeng Zuo;Zhenhua Guo;Xiu Li

Volume 47, issue 6, pages 4630-4651. Full text: https://ieeexplore.ieee.org/document/10902142/

Citations: 0

Abstract

Deep generative models have gained considerable attention in low-level vision tasks due to their powerful generative capabilities. Among these, diffusion model-based approaches, which employ a forward diffusion process to degrade an image and a reverse denoising process for image generation, have become particularly prominent for producing high-quality, diverse samples with intricate texture details. Despite their widespread success in low-level vision, there remains a lack of a comprehensive, insightful survey that synthesizes and organizes the advances in diffusion model-based techniques. To address this gap, this paper presents the first comprehensive review focused on denoising diffusion models applied to low-level vision tasks, covering both theoretical and practical contributions. We outline three general diffusion modeling frameworks and explore their connections with other popular deep generative models, establishing a solid theoretical foundation for subsequent analysis. We then categorize diffusion models used in low-level vision tasks from multiple perspectives, considering both the underlying framework and the target application. Beyond natural image processing, we also summarize diffusion models applied to other low-level vision domains, including medical imaging, remote sensing, and video processing. Additionally, we provide an overview of widely used benchmarks and evaluation metrics in low-level vision tasks. Our review includes an extensive evaluation of diffusion model-based techniques across six representative tasks, with both quantitative and qualitative analysis. Finally, we highlight the limitations of current diffusion models and propose four promising directions for future research. This comprehensive review aims to foster a deeper understanding of the role of denoising diffusion models in low-level vision.
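The forward degradation and reverse denoising processes mentioned in the abstract can be made concrete with a minimal sketch. It assumes the common DDPM-style formulation, in which a noisy sample is drawn in closed form as x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps. The abstract does not say which formulation the survey's three frameworks use, so the linear noise schedule, the step count, and all function names below are illustrative assumptions, not the authors' method. The "predicted" noise is the true noise, standing in for a trained denoising network.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; the endpoints are common defaults, not from the survey."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha(betas):
    """abar_t = product over s <= t of (1 - beta_s): the fraction of signal variance kept at step t."""
    alpha_bar, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def forward_diffuse(x0, t, alpha_bar, rng):
    """Forward process: jump directly from the clean pixels x0 to the noisy pixels x_t."""
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * p + sigma * e for p, e in zip(x0, noise)]
    return x_t, noise


def estimate_x0(x_t, t, alpha_bar, predicted_noise):
    """Reverse direction: invert the forward formula given a noise estimate."""
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    return [(p - sigma * e) / signal for p, e in zip(x_t, predicted_noise)]


rng = random.Random(0)
alpha_bar = cumulative_alpha(linear_beta_schedule())

# A toy "image": 64 pixel values scaled to [-1, 1].
x0 = [rng.uniform(-1.0, 1.0) for _ in range(64)]

# Degrade at an intermediate step, then recover using the true noise
# as a stand-in for a trained network's prediction.
x_t, noise = forward_diffuse(x0, 500, alpha_bar, rng)
x0_hat = estimate_x0(x_t, 500, alpha_bar, noise)
```

With a perfect noise estimate the clean image is recovered exactly, up to floating-point error. In practice the estimate comes from a learned network and sampling proceeds over many small reverse steps, which is where the diversity of generated details comes from.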

