F-MDM: Rethinking image denoising with a feature map-based Poisson–Gaussian Mixture Diffusion Model

IF 3.1 | Region 4, Computer Science | JCR Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation Pub Date : 2025-09-25 DOI:10.1016/j.jvcir.2025.104593

Bin Wang, Jiajia Hu, Fengyuan Zuo, Junfei Shi, Haiyan Jin

Volume 112, Article 104593. Publication type: Journal Article. Open access: no.

Publisher page: https://www.sciencedirect.com/science/article/pii/S104732032500207X

Citations: 0


F-MDM: Rethinking image denoising with a feature map-based Poisson–Gaussian Mixture Diffusion Model

Diffusion models have shown great potential in image-denoising tasks. A diffusion model usually takes a dataset of clean, noise-free images from a real scene as the starting point for diffusion. When a denoising network trained on such a dataset is applied to images from other scenes, its generalization degrades because the scene priors change. To improve generalization, we look for a clean image dataset that has rich scene priors yet is, to some degree, scene-independent. VGG-16 is a network trained on a large number of images. When real scene images are passed through the VGG-16 convolution layers, the resulting shallow feature maps retain scene priors while breaking free from the scene dependency caused by minor details. This paper uses the shallow feature maps of VGG-16 as the clean image dataset for the diffusion model, and the denoising experiments yield surprising results. Furthermore, image noise consists mainly of Gaussian and Poisson noise, yet the classical diffusion model diffuses with Gaussian noise only. To improve the interpretability of the model, we introduce a novel Poisson–Gaussian noise mixture for the diffusion process and give its theoretical derivation. Finally, we propose a Poisson–Gaussian Denoising Mixture Diffusion Model based on Feature maps (F-MDM). Experiments demonstrate that our method generalizes well compared with several other advanced algorithms.
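The abstract does not spell out the Poisson–Gaussian mixture used in the diffusion process, so the following is only a minimal sketch of the standard Poisson–Gaussian sensor-noise model, not the authors' derivation. It assumes clean intensities in [0, 1]; the function names and the `peak` and `sigma` parameters are hypothetical and chosen for illustration. The Poisson term is signal-dependent (variance `x / peak`), while the Gaussian term is signal-independent (variance `sigma ** 2`).

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method.

    Adequate for small rates (lam up to a few tens); assumed helper,
    not taken from the paper.
    """
    if lam <= 0.0:
        return 0
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def add_poisson_gaussian_noise(pixels, peak=30.0, sigma=0.05, seed=0):
    """Corrupt clean intensities in [0, 1] with Poisson plus Gaussian noise."""
    rng = random.Random(seed)
    noisy = []
    for x in pixels:
        # Signal-dependent part: mean x, variance x / peak.
        shot = sample_poisson(peak * x, rng) / peak
        # Signal-independent part: mean 0, variance sigma ** 2.
        read = rng.gauss(0.0, sigma)
        noisy.append(shot + read)
    return noisy


def expected_variance(x, peak=30.0, sigma=0.05):
    """Per-pixel noise variance implied by the model above."""
    return x / peak + sigma ** 2


# Example: corrupt a small row of clean intensities.
clean_row = [0.0, 0.25, 0.5, 1.0]
noisy_row = add_poisson_gaussian_noise(clean_row, peak=30.0, sigma=0.05, seed=0)
```

Under these assumptions, brighter pixels receive more noise than darker ones, which a purely Gaussian forward process cannot express. In a diffusion setting the same corruption would be applied repeatedly over time steps with a noise schedule; that schedule is not given in the abstract and is left out here.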


Source journal

Journal of Visual Communication and Image Representation (Engineering and Technology, Computer Science: Software Engineering)

CiteScore: 5.40

Self-citation rate: 11.50%

Articles published: 188

Review time: 9.9 months

About the journal: The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.