SinDiffusion: Learning a Diffusion Model From a Single Natural Image

Weilun Wang; Jianmin Bao; Wengang Zhou; Dongdong Chen; Dong Chen; Lu Yuan; Houqiang Li

IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 5, pp. 3412-3423. Published online 2025-01-27. DOI: 10.1109/TPAMI.2025.3532956.
We present SinDiffusion, which leverages denoising diffusion models to capture the internal distribution of patches from a single natural image. The default approach of previous GAN-based methods for this problem is to train multiple models at progressively growing scales, which leads to an accumulation of errors and causes characteristic artifacts in the generated results. In this paper, we show that multiple models at progressively growing scales are not essential for learning from a single image, and we propose SinDiffusion, a single diffusion-based model trained at a single scale, which is better suited to this task. Furthermore, we identify that a patch-level receptive field is crucial for diffusion models to capture the image's patch statistics; we therefore redesign a patch-wise denoising network for SinDiffusion. Coupling these two designs enables SinDiffusion to generate more photorealistic and diverse images from a single image than GAN-based approaches. SinDiffusion also supports applications beyond the capability of SinGAN, e.g., text-guided image generation and image outpainting. Extensive experiments on a wide range of images demonstrate the superiority of SinDiffusion for modeling the patch distribution.
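The two ingredients the abstract names are standard and easy to illustrate in isolation: (1) the closed-form diffusion forward process that produces the noisy training inputs, and (2) the fact that a shallow stride-1 convolutional stack only "sees" a patch of the input, which is what restricts the denoiser to patch statistics. The sketch below is generic DDPM background under common default choices (a linear beta schedule from 1e-4 to 0.02 over 1000 steps), not the paper's actual network or hyperparameters; the receptive-field formula applies to plain stride-1 convolutions without dilation.

```python
import numpy as np

# Closed-form DDPM forward process: x_t = sqrt(abar_t)*x0 + sqrt(1-abar_t)*eps,
# using a common linear beta schedule (an assumption, not the paper's setting).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)  # abar_t = prod_{s<=t} (1 - beta_s)

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(64, 64))  # the single training image, in [-1, 1]
t = 500
eps = rng.standard_normal(x0.shape)
x_t = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps
# A denoiser would be trained to predict eps from (x_t, t).

# Receptive field of a plain stride-1 conv stack: rf = 1 + sum(k - 1).
# This is what bounds how large a patch each output pixel depends on.
def receptive_field(kernel_sizes):
    return 1 + sum(k - 1 for k in kernel_sizes)

# e.g. five 3x3 convs see only an 11x11 patch of the input image
print(receptive_field([3] * 5))  # -> 11
```

Because each output pixel of such a stack depends only on an 11x11 input patch, the network can model the distribution of patches without memorizing the global layout of the single image; a deeper or downsampling network would widen this window and push the model toward reproducing the whole image.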