Yang Luo, Yiheng Zhang, Zhaofan Qiu, Ting Yao, Zhineng Chen, Yu-Gang Jiang, Tao Mei
arXiv - CS - Multimedia, 2024-09-11. doi: arxiv-2409.07451
FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process
The emergence of text-to-image generation models has led to the recognition
that image enhancement, performed as post-processing, would significantly
improve the visual quality of the generated images. Exploring diffusion models
to enhance generated images is nevertheless non-trivial: one must delicately enrich details while preserving the visual appearance of key content in the original image. In this paper, we propose a novel framework,
namely FreeEnhance, for content-consistent image enhancement using
off-the-shelf image diffusion models. Technically, FreeEnhance is a two-stage
process that first adds random noise to the input image and then capitalizes
on a pre-trained image diffusion model (i.e., Latent Diffusion Models) to
denoise and enhance the image details. In the noising stage, FreeEnhance is
devised to add lighter noise to regions of higher frequency, preserving
the high-frequency patterns (e.g., edges, corners) of the original image. In the
denoising stage, we present three target properties as constraints to
regularize the predicted noise, producing enhanced images with high acutance and high
visual quality. Extensive experiments conducted on the HPDv2 dataset
demonstrate that our FreeEnhance outperforms the state-of-the-art image
enhancement models in terms of quantitative metrics and human preference. More
remarkably, FreeEnhance also achieves higher human preference than the
commercial image enhancement solution Magnific AI.
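The frequency-adaptive noising stage described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the function name `content_adaptive_noise`, the gradient-magnitude frequency map, and the `t_strength` parameter are hypothetical stand-ins for whatever frequency estimate and noise schedule FreeEnhance actually uses.

```python
import numpy as np

def content_adaptive_noise(img, t_strength=0.6, seed=0):
    """Add Gaussian noise that is lighter in high-frequency regions.

    img: float array in [0, 1], shape (H, W).
    Regions with strong local gradients (edges, corners) receive
    weaker noise, so their patterns survive the later denoising pass.
    Note: the frequency map here (finite-difference gradient magnitude)
    is an illustrative assumption, not the paper's estimator.
    """
    rng = np.random.default_rng(seed)
    # Crude high-frequency map: magnitude of finite differences.
    gy, gx = np.gradient(img)
    freq = np.sqrt(gx ** 2 + gy ** 2)
    freq = freq / (freq.max() + 1e-8)      # normalize to [0, 1]
    scale = t_strength * (1.0 - freq)      # lighter noise where freq is high
    noised = img + scale * rng.standard_normal(img.shape)
    return np.clip(noised, 0.0, 1.0)
```

On a step-edge test image, pixels along the edge (frequency map near 1) are perturbed far less than pixels in flat regions, which is exactly the content-preserving behavior the abstract describes.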