TopoDiff: Training-free image generation with topological layout control

Impact factor: 7.5 · Zone 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Shitong Cao, Xuejie Zhang, Jin Wang, Xiaobing Zhou
Journal: Expert Systems with Applications, Volume 291, Article 128556
Published: 2025-06-11 (Journal Article)
DOI: 10.1016/j.eswa.2025.128556
URL: https://www.sciencedirect.com/science/article/pii/S095741742502175X
Citations: 0

Abstract

Recent diffusion models can generate high-quality images from text, but their spatial control remains limited. This work aims to improve layout control in text-to-image generation without retraining existing models. The proposed TopoDiff framework is a training-free approach that uses topological guidance to enable precise spatial control during inference, leaving the original architecture and parameters of Stable Diffusion unmodified. It employs a graph-based topological language to explicitly capture spatial relationships between objects and integrates a topological loss into the diffusion model's denoising process. In addition, a dynamic offset mechanism adjusts spatial positions during generation, balancing topological structure consistency with the flexibility required for complex generation. Experimental results show that TopoDiff achieves over 10% higher Average Precision (AP) than Stable Diffusion. The source code is publicly available at https://github.com/marcocst/TopoDiff.
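The abstract does not state the form of the topological loss or the update rule, so the following is only a minimal sketch of the general idea of loss-guided, training-free layout control, not the authors' implementation. Everything in it is an assumption made for illustration: the edge list `LAYOUT_EDGES`, the relation names `left_of` and `above`, the hinge loss on attention centroids, the `toy_attention` stand-in for a diffusion model's cross-attention maps, and the finite-difference gradient (a real pipeline would use automatic differentiation through the denoiser). The dynamic offset mechanism is not modeled.

```python
import numpy as np

# Hypothetical layout graph: each edge (a, rel, b) reads "a is <rel> b".
LAYOUT_EDGES = [("cat", "left_of", "dog")]

def centroid(attn):
    """Attention-weighted (x, y) centroid of a 2D map, normalized to [0, 1]."""
    h, w = attn.shape
    ys, xs = np.mgrid[0:h, 0:w]
    total = attn.sum() + 1e-8
    return np.array([(attn * xs).sum() / total / (w - 1),
                     (attn * ys).sum() / total / (h - 1)])

def relation_loss(attn_maps, edges, margin=0.2):
    """Sum of hinge penalties over the relations that the centroids violate."""
    loss = 0.0
    for a, rel, b in edges:
        ca, cb = centroid(attn_maps[a]), centroid(attn_maps[b])
        if rel == "left_of":
            loss += max(0.0, ca[0] - cb[0] + margin)
        elif rel == "above":
            loss += max(0.0, ca[1] - cb[1] + margin)
        else:
            raise ValueError(f"unknown relation: {rel}")
    return loss

def toy_attention(latent, names):
    """Stand-in for cross-attention: a spatial softmax per object channel."""
    maps = {}
    for i, name in enumerate(names):
        e = np.exp(latent[i] - latent[i].max())
        maps[name] = e / e.sum()
    return maps

def numerical_grad(f, x, eps=1e-4):
    """Central finite-difference gradient of scalar f at x (x is restored)."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        old = x[idx]
        x[idx] = old + eps
        hi = f(x)
        x[idx] = old - eps
        lo = f(x)
        x[idx] = old
        grad[idx] = (hi - lo) / (2 * eps)
    return grad

def guided_update(latent, names, edges, step=5.0):
    """One guidance step: move the latent down the gradient of the layout loss.
    No model weights are touched, which is what makes the scheme training-free."""
    f = lambda z: relation_loss(toy_attention(z, names), edges)
    return latent - step * numerical_grad(f, latent)

rng = np.random.default_rng(0)
names = ["cat", "dog"]
latent = rng.normal(size=(2, 8, 8))  # toy latent: one 8x8 channel per object

before = relation_loss(toy_attention(latent, names), LAYOUT_EDGES)
for _ in range(20):  # in a real sampler this would run inside the denoising loop
    latent = guided_update(latent, names, LAYOUT_EDGES)
after = relation_loss(toy_attention(latent, names), LAYOUT_EDGES)
print(f"layout loss before: {before:.3f}, after: {after:.3f}")
```

In this toy setting only the latent is updated, which mirrors the claim that the architecture and parameters of the base model stay unmodified; the authors' actual loss and schedule are in the linked repository.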
Source journal
Expert Systems with Applications (Engineering & Technology - Engineering: Electrical & Electronic)
CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months
About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.