TopoDiff: Training-free image generation with topological layout control

IF 7.5 | CAS Zone 1, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Expert Systems with Applications Pub Date : 2025-06-11 DOI:10.1016/j.eswa.2025.128556

Shitong Cao, Xuejie Zhang, Jin Wang, Xiaobing Zhou

Expert Systems with Applications, Volume 291, Article 128556. Journal Article. https://www.sciencedirect.com/science/article/pii/S095741742502175X


Abstract

Recent diffusion models can generate high-quality images from text, but their spatial control remains limited. This work aims to enhance layout control in text-to-image generation without retraining existing models. The proposed TopoDiff framework is a training-free approach that leverages topological guidance to enable precise spatial control during inference, leaving the original architecture and parameters of Stable Diffusion unmodified. The approach employs a graph-based topological language to explicitly capture spatial relationships between objects and integrates a topological loss into the diffusion model's denoising process. Additionally, a dynamic offset mechanism adjusts spatial positions during generation, balancing topological structure consistency with the flexibility required for complex generation. Experimental results demonstrate that TopoDiff achieves over 10% higher Average Precision (AP) than Stable Diffusion. The source code is publicly available at https://github.com/marcocst/TopoDiff.
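To make the idea of a graph-based layout loss concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes that each object has a 2D attention map, that spatial relationships are given as graph edges of the form (subject, relation, object), and that a hinge penalty on attention centroids stands in for the topological loss. The relation names, the `margin` parameter, and the functions `centroid` and `layout_loss` are all hypothetical.

```python
import numpy as np


def centroid(attn):
    """Centre of mass (x, y) of a non-negative 2D map, scaled to [0, 1]."""
    h, w = attn.shape
    ys, xs = np.mgrid[0:h, 0:w]
    total = attn.sum()
    cx = (attn * xs).sum() / total / (w - 1)
    cy = (attn * ys).sum() / total / (h - 1)
    return np.array([cx, cy])


# Hypothetical relation vocabulary: relation -> (axis, sign).
# Image coordinates are assumed: x grows rightwards, y grows downwards.
RELATIONS = {
    "left_of": (0, -1.0),
    "right_of": (0, 1.0),
    "above": (1, -1.0),
    "below": (1, 1.0),
}


def layout_loss(attn_maps, edges, margin=0.1):
    """Hinge loss that is zero when every edge holds by at least `margin`."""
    loss = 0.0
    for subj, rel, obj in edges:
        axis, sign = RELATIONS[rel]
        diff = centroid(attn_maps[subj])[axis] - centroid(attn_maps[obj])[axis]
        loss += max(0.0, margin - sign * diff)
    return loss


def blob(h, w, row, col):
    """Toy attention map with all mass on a single cell."""
    m = np.zeros((h, w))
    m[row, col] = 1.0
    return m


# Toy layout: "cat" attends to the left of the image, "dog" to the right.
maps = {"cat": blob(8, 8, 4, 1), "dog": blob(8, 8, 4, 6)}
satisfied = layout_loss(maps, [("cat", "left_of", "dog")])
violated = layout_loss(maps, [("dog", "left_of", "cat")])
print(satisfied, violated)
```

In a training-free guidance setting, a loss of this kind would typically be differentiated with respect to the noisy latent at each denoising step and used to nudge the latent, which leaves the model weights untouched. Whether TopoDiff computes its loss on attention maps or in this exact form is not stated in the abstract.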


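Source journal

Expert Systems with Applications (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 13.80

Self-citation rate: 10.60%

Articles published: 2045

Review time: 8.7 months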

Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.