SinDiffusion: Learning a Diffusion Model From a Single Natural Image

Impact factor: 18.6
Weilun Wang;Jianmin Bao;Wengang Zhou;Dongdong Chen;Dong Chen;Lu Yuan;Houqiang Li
DOI: 10.1109/TPAMI.2025.3532956
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(5), pp. 3412-3423
Published: 2025-01-27
Publication type: Journal Article
URL: https://ieeexplore.ieee.org/document/10855351/
Citations: 0

Abstract

We present SinDiffusion, which leverages denoising diffusion models to capture the internal distribution of patches in a single natural image. Previous GAN-based methods for this problem typically train multiple models at progressively growing scales, which accumulates errors and causes characteristic artifacts in the generated results. In this paper, we show that multiple models at progressively growing scales are not essential for learning from a single image, and we propose SinDiffusion, a single diffusion-based model trained at a single scale, which is better suited to this task. Furthermore, we find that a patch-level receptive field is crucial for diffusion models to capture the image's patch statistics, so we redesign a patch-wise denoising network for SinDiffusion. Coupling these two designs enables SinDiffusion to generate more photorealistic and diverse images from a single image than GAN-based approaches. SinDiffusion can also be applied to various applications, e.g., text-guided image generation and image outpainting, which are beyond the capability of SinGAN. Extensive experiments on a wide range of images demonstrate the superiority of SinDiffusion for modeling the patch distribution.
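
The abstract's point about a patch-level receptive field can be made concrete with the standard receptive-field recurrence for stacked convolutions. The sketch below is illustrative only: the function name `receptive_field` and both layer stacks are hypothetical and are not taken from the SinDiffusion architecture. It shows how removing strided (downsampling) layers keeps each output pixel dependent on a small input patch rather than on a large part of the image.

```python
def receptive_field(layers):
    """Receptive field, in input pixels, of a stack of conv layers.

    `layers` is a sequence of (kernel_size, stride) pairs applied in order.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # spacing, in input pixels, between adjacent outputs of the current layer
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) * jump
        jump *= stride             # striding spreads later layers further apart
    return rf


# Hypothetical stacks, for illustration only (not the paper's network).
no_downsampling = [(3, 1)] * 4                       # four 3x3 convs, stride 1
with_downsampling = [(3, 1), (3, 2)] * 3 + [(3, 1)]  # three stride-2 stages

print(receptive_field(no_downsampling))    # 9
print(receptive_field(with_downsampling))  # 45
```

Under these assumed stacks, the stride-1 network sees a 9x9 patch per output pixel, while the downsampling network sees 45x45, which on a small training image approaches the whole picture.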