EDM: an enhanced diffusion model for image restoration in complex scenes

JiaYan Wen, YuanSheng Zhuang, JunYi Deng
The Visual Computer, published 2024-07-24. DOI: 10.1007/s00371-024-03549-2

Abstract

At present, diffusion models have achieved state-of-the-art performance by modeling the image synthesis process as a series of denoising network applications. Unlike image synthesis, image restoration (IR) aims to improve the subjective quality of images corrupted by various kinds of degradation. IR for complex scenes such as worksite images, however, remains highly challenging in low-level vision because of complicated environmental factors. To address this problem, we propose an enhanced diffusion model for image restoration in complex scenes (EDM). It improves the authenticity and representational ability of the generation process while effectively handling complex backgrounds and diverse object types. EDM makes three main contributions: (1) Its framework adopts a Mish-based residual module, which strengthens the ability to learn complex image patterns and allows negative gradients to pass through, reducing the risk of overfitting during training. (2) It employs a mixed-head self-attention mechanism, which augments the correlation among input elements at each time step and strikes a better balance between capturing the global structure and the local detailed textures of an image. (3) This study evaluates EDM on a self-built dataset tailored for worksite image restoration, named “Workplace,” and compares the results with those on two public datasets, Places2 and Rain100H. The experiments on these datasets demonstrate not only EDM’s application value in a specific domain but also its potential and versatility in broader image restoration tasks. Code, dataset and models are available at: https://github.com/Zhuangvictor0/EDM-A-Enhanced-Diffusion-Models-for-Image-Restoration-in-Complex-Scenes
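The abstract names two building blocks without giving their exact formulation: a Mish-based residual module and a mixed-head self-attention mechanism. As a minimal sketch only, the following NumPy code illustrates the standard components these presumably build on: the Mish activation (x · tanh(softplus(x)), which, unlike ReLU, lets small negative values and their gradients pass through), a hypothetical residual block using it, and plain scaled dot-product self-attention. The function names and the two-layer block structure are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

def mish(x):
    # Mish(x) = x * tanh(softplus(x)); smooth, and unlike ReLU it
    # passes small negative values (hence negative gradients).
    return x * np.tanh(np.log1p(np.exp(x)))

def mish_residual_block(x, w1, w2):
    # Hypothetical residual block: two linear maps with Mish
    # activations plus an identity skip connection.
    return x + mish(mish(x @ w1) @ w2)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    # Standard scaled dot-product self-attention over a sequence x;
    # the paper's "mixed-head" variant presumably extends this, but
    # its exact formulation is not given in the abstract.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v
```

With zero weights the residual block reduces to the identity skip, which is the property that keeps gradients flowing through deep stacks of such blocks.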

