InverseMeetInsert:通过引导扩散模型中的几何累积反演进行稳健的真实图像编辑

Yan Zheng, Lemeng Wu
{"title":"InverseMeetInsert:通过引导扩散模型中的几何累积反演进行稳健的真实图像编辑","authors":"Yan Zheng, Lemeng Wu","doi":"arxiv-2409.11734","DOIUrl":null,"url":null,"abstract":"In this paper, we introduce Geometry-Inverse-Meet-Pixel-Insert, short for\nGEO, an exceptionally versatile image editing technique designed to cater to\ncustomized user requirements at both local and global scales. Our approach\nseamlessly integrates text prompts and image prompts to yield diverse and\nprecise editing outcomes. Notably, our method operates without the need for\ntraining and is driven by two key contributions: (i) a novel geometric\naccumulation loss that enhances DDIM inversion to faithfully preserve pixel\nspace geometry and layout, and (ii) an innovative boosted image prompt\ntechnique that combines pixel-level editing for text-only inversion with latent\nspace geometry guidance for standard classifier-free reversion. Leveraging the\npublicly available Stable Diffusion model, our approach undergoes extensive\nevaluation across various image types and challenging prompt editing scenarios,\nconsistently delivering high-fidelity editing results for real images.","PeriodicalId":501130,"journal":{"name":"arXiv - CS - Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"InverseMeetInsert: Robust Real Image Editing via Geometric Accumulation Inversion in Guided Diffusion Models\",\"authors\":\"Yan Zheng, Lemeng Wu\",\"doi\":\"arxiv-2409.11734\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we introduce Geometry-Inverse-Meet-Pixel-Insert, short for\\nGEO, an exceptionally versatile image editing technique designed to cater to\\ncustomized user requirements at both local and global scales. Our approach\\nseamlessly integrates text prompts and image prompts to yield diverse and\\nprecise editing outcomes. Notably, our method operates without the need for\\ntraining and is driven by two key contributions: (i) a novel geometric\\naccumulation loss that enhances DDIM inversion to faithfully preserve pixel\\nspace geometry and layout, and (ii) an innovative boosted image prompt\\ntechnique that combines pixel-level editing for text-only inversion with latent\\nspace geometry guidance for standard classifier-free reversion. Leveraging the\\npublicly available Stable Diffusion model, our approach undergoes extensive\\nevaluation across various image types and challenging prompt editing scenarios,\\nconsistently delivering high-fidelity editing results for real images.\",\"PeriodicalId\":501130,\"journal\":{\"name\":\"arXiv - CS - Computer Vision and Pattern Recognition\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Computer Vision and Pattern Recognition\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.11734\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Computer Vision and Pattern Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11734","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

在本文中,我们介绍了几何反转与像素插入(Geometry-Inverse-Meet-Pixel-Insert,简称GEO),这是一种非常灵活的图像编辑技术,旨在满足用户在局部和全局范围内的个性化需求。我们的方法将文本提示和图像提示完美地结合在一起,从而产生多样化的精确编辑结果。值得注意的是,我们的方法无需训练即可运行,并由两个关键贡献驱动:(i) 一种新颖的几何累积损失,可增强 DDIM 反演以忠实保留像素空间的几何和布局;(ii) 一种创新的增强图像提示技术,可将纯文本反演的像素级编辑与标准无分类器反演的潜空间几何引导相结合。利用公开的稳定扩散模型,我们的方法在各种图像类型和具有挑战性的提示编辑场景中进行了广泛的评估,始终如一地为真实图像提供高保真编辑结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
InverseMeetInsert: Robust Real Image Editing via Geometric Accumulation Inversion in Guided Diffusion Models
In this paper, we introduce Geometry-Inverse-Meet-Pixel-Insert, short for GEO, an exceptionally versatile image editing technique designed to cater to customized user requirements at both local and global scales. Our approach seamlessly integrates text prompts and image prompts to yield diverse and precise editing outcomes. Notably, our method operates without the need for training and is driven by two key contributions: (i) a novel geometric accumulation loss that enhances DDIM inversion to faithfully preserve pixel space geometry and layout, and (ii) an innovative boosted image prompt technique that combines pixel-level editing for text-only inversion with latent space geometry guidance for standard classifier-free reversion. Leveraging the publicly available Stable Diffusion model, our approach undergoes extensive evaluation across various image types and challenging prompt editing scenarios, consistently delivering high-fidelity editing results for real images.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信