{"title":"EDM:用于复杂场景图像复原的增强扩散模型","authors":"JiaYan Wen, YuanSheng Zhuang, JunYi Deng","doi":"10.1007/s00371-024-03549-2","DOIUrl":null,"url":null,"abstract":"<p>At presently, <i>diffusion model</i> has achieved state-of-the-art performance by modeling the image synthesis process through a series of denoising network applications. Image restoration (IR) is to improve the subjective image quality corrupted by various kinds of degradation unlike image synthesis. However, IR for complex scenes such as worksite images is greatly challenging in the low-level vision field due to complicated environmental factors. To solve this problem, we propose a enhanced diffusion models for image restoration in complex scenes (EDM). It improves the authenticity and representation ability for the generation process, while effectively handles complex backgrounds and diverse object types. EDM has three main contributions: (1) Its framework adopts a Mish-based residual module, which enhances the ability to learn complex patterns of images, and allows for the presence of negative gradients to reduce overfitting risks during model training. (2) It employs a mixed-head self-attention mechanism, which augments the correlation among input elements at each time step, and maintains a better balance between capturing the global structural information and local detailed textures of the image. (3) This study evaluates EDM on a self-built dataset specifically tailored for worksite image restoration, named “Workplace,” and was compared with results from another two public datasets named Places2 and Rain100H. Furthermore, the achievement of experiments on these datasets not only demonstrates EDM’s application value in a specific domain, but also its potential and versatility in broader image restoration tasks. Code, dataset and models are available at: https://github.com/Zhuangvictor0/EDM-A-Enhanced-Diffusion-Models-for-Image-Restoration-in-Complex-Scenes</p>","PeriodicalId":501186,"journal":{"name":"The Visual Computer","volume":"67 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"EDM: a enhanced diffusion models for image restoration in complex scenes\",\"authors\":\"JiaYan Wen, YuanSheng Zhuang, JunYi Deng\",\"doi\":\"10.1007/s00371-024-03549-2\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>At presently, <i>diffusion model</i> has achieved state-of-the-art performance by modeling the image synthesis process through a series of denoising network applications. Image restoration (IR) is to improve the subjective image quality corrupted by various kinds of degradation unlike image synthesis. However, IR for complex scenes such as worksite images is greatly challenging in the low-level vision field due to complicated environmental factors. To solve this problem, we propose a enhanced diffusion models for image restoration in complex scenes (EDM). It improves the authenticity and representation ability for the generation process, while effectively handles complex backgrounds and diverse object types. EDM has three main contributions: (1) Its framework adopts a Mish-based residual module, which enhances the ability to learn complex patterns of images, and allows for the presence of negative gradients to reduce overfitting risks during model training. 
(2) It employs a mixed-head self-attention mechanism, which augments the correlation among input elements at each time step, and maintains a better balance between capturing the global structural information and local detailed textures of the image. (3) This study evaluates EDM on a self-built dataset specifically tailored for worksite image restoration, named “Workplace,” and was compared with results from another two public datasets named Places2 and Rain100H. Furthermore, the achievement of experiments on these datasets not only demonstrates EDM’s application value in a specific domain, but also its potential and versatility in broader image restoration tasks. Code, dataset and models are available at: https://github.com/Zhuangvictor0/EDM-A-Enhanced-Diffusion-Models-for-Image-Restoration-in-Complex-Scenes</p>\",\"PeriodicalId\":501186,\"journal\":{\"name\":\"The Visual Computer\",\"volume\":\"67 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-07-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The Visual Computer\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1007/s00371-024-03549-2\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Visual Computer","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00371-024-03549-2","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
EDM: a enhanced diffusion models for image restoration in complex scenes
Diffusion models currently achieve state-of-the-art performance in image synthesis by modeling the generation process as a sequence of denoising network applications. Unlike image synthesis, image restoration (IR) aims to improve the subjective quality of images corrupted by various kinds of degradation. IR for complex scenes such as worksite images remains highly challenging in low-level vision because of complicated environmental factors. To address this problem, we propose an enhanced diffusion model for image restoration in complex scenes (EDM). It improves the authenticity and representational ability of the generation process while effectively handling complex backgrounds and diverse object types. EDM makes three main contributions: (1) Its framework adopts a Mish-based residual module, which strengthens the ability to learn complex image patterns and admits negative gradients, reducing the risk of overfitting during training. (2) It employs a mixed-head self-attention mechanism, which strengthens the correlation among input elements at each time step and maintains a better balance between capturing the global structure and the local detailed textures of the image. (3) EDM is evaluated on a self-built dataset tailored for worksite image restoration, named "Workplace," and compared against results on two public datasets, Places2 and Rain100H. The experiments on these datasets demonstrate not only EDM's application value in a specific domain but also its potential and versatility in broader image restoration tasks. Code, dataset, and models are available at: https://github.com/Zhuangvictor0/EDM-A-Enhanced-Diffusion-Models-for-Image-Restoration-in-Complex-Scenes
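To make the two architectural components named in the abstract concrete, the sketch below shows what a Mish-activated residual block and a simple "mixed-head" self-attention layer could look like in PyTorch. This is an illustrative sketch under our own assumptions (channel counts, GroupNorm, and a window-based split between global and local heads), not the authors' released implementation; the actual EDM code is available at the GitHub link above.

```python
# Illustrative sketch only (not the released EDM code): a Mish-based residual
# block and a toy mixed-head self-attention layer for a denoising U-Net feature map.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MishResidualBlock(nn.Module):
    """Residual block using Mish (x * tanh(softplus(x))), which lets small
    negative values pass instead of zeroing them as ReLU does."""

    def __init__(self, channels: int):
        super().__init__()  # assumes channels is divisible by 8 for GroupNorm
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.norm1 = nn.GroupNorm(8, channels)
        self.norm2 = nn.GroupNorm(8, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = F.mish(self.norm1(self.conv1(x)))
        h = self.norm2(self.conv2(h))
        return F.mish(h + x)  # residual connection keeps gradient flow stable


class MixedHeadSelfAttention(nn.Module):
    """Toy 'mixed-head' attention: half the heads attend over the whole feature
    map (global structure), the other half attend within local windows (fine
    textures). The exact mixing rule used by EDM may differ."""

    def __init__(self, channels: int, num_heads: int = 4, window: int = 8):
        super().__init__()
        assert num_heads % 2 == 0
        self.global_attn = nn.MultiheadAttention(channels, num_heads // 2, batch_first=True)
        self.local_attn = nn.MultiheadAttention(channels, num_heads // 2, batch_first=True)
        self.window = window
        self.proj = nn.Linear(2 * channels, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map; H and W must be divisible by the window size
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)             # (B, H*W, C)
        g, _ = self.global_attn(tokens, tokens, tokens)   # global heads

        # Local heads: attend inside non-overlapping windows of size ws x ws.
        ws = self.window
        xw = x.unfold(2, ws, ws).unfold(3, ws, ws)        # (B, C, H/ws, W/ws, ws, ws)
        xw = xw.permute(0, 2, 3, 4, 5, 1).reshape(-1, ws * ws, c)
        l, _ = self.local_attn(xw, xw, xw)
        l = l.reshape(b, h // ws, w // ws, ws, ws, c)
        l = l.permute(0, 5, 1, 3, 2, 4).reshape(b, c, h, w)
        l = l.flatten(2).transpose(1, 2)                  # back to (B, H*W, C)

        out = self.proj(torch.cat([g, l], dim=-1))        # fuse global and local heads
        return out.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    # Minimal shape check with hypothetical sizes (64 channels, 64x64 features).
    feat = torch.randn(2, 64, 64, 64)
    block = MishResidualBlock(64)
    attn = MixedHeadSelfAttention(64, num_heads=4, window=8)
    print(attn(block(feat)).shape)  # torch.Size([2, 64, 64, 64])
```

The window size and the even split between global and local heads are placeholder choices for the sketch; in practice such hyperparameters would be tuned per resolution stage of the denoising network.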