Should I Render or Should AI Generate? Crafting Synthetic Semantic Segmentation Datasets With Controlled Generation.

IF 1.7 | Zone 4, Computer Science | JCR Q3, COMPUTER SCIENCE, SOFTWARE ENGINEERING

IEEE Computer Graphics and Applications | Pub Date: 2025-03-01 | DOI: 10.1109/MCG.2025.3553494

Omar A Mures, Manuel Silva, Manuel Lijo-Sanchez, Emilio J Padron, Jose A Iglesias-Guitian

Pages: 57-68 | Publication type: Journal Article

Citations: 0

Abstract


Should I Render or Should AI Generate? Crafting Synthetic Semantic Segmentation Datasets With Controlled Generation.

This work explores the integration of generative AI models for automatically generating synthetic image-labeled data. Our approach leverages controllable diffusion models to generate synthetic variations of semantically labeled images. Synthetic datasets for semantic segmentation struggle to represent real-world subtleties, such as different weather conditions or fine details, typically relying on costly simulations and rendering. However, diffusion models can generate diverse images using input text prompts and guidance images, such as semantic masks. Our work introduces and tests a novel methodology for generating labeled synthetic images, with an initial focus on semantic segmentation, a demanding computer vision task. We showcase our approach in two distinct image segmentation domains, outperforming traditional computer graphics simulations in efficiently creating diverse datasets and training downstream models. We leverage generative models for crafting synthetically labeled images, posing the question: "Should I render or should AI generate?" Our results endorse a paradigm shift toward controlled generation models.
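The abstract describes generating several image variations from one semantic mask plus text prompts, so that every generated image can reuse the mask as its label. The following is a minimal sketch of that data flow only, not the authors' implementation. The palette, the prompt wording, the function names, and `dummy_generate` are all hypothetical; in practice `generate` would wrap a mask-conditioned diffusion model of your choice.

```python
from itertools import product
from typing import Callable, Iterator

import numpy as np

# Hypothetical class palette (class id -> RGB colour); not taken from the paper.
PALETTE = np.array(
    [
        [128, 64, 128],  # 0: road
        [70, 70, 70],    # 1: building
        [107, 142, 35],  # 2: vegetation
        [70, 130, 180],  # 3: sky
    ],
    dtype=np.uint8,
)


def colorize(mask: np.ndarray) -> np.ndarray:
    """Turn an (H, W) integer label map into an (H, W, 3) RGB guidance image."""
    return PALETTE[mask]


def prompt_variations(scene: str, weathers: list[str], times: list[str]) -> list[str]:
    """Build one text prompt per (weather, time-of-day) combination."""
    return [f"{scene}, {w}, {t}" for w, t in product(weathers, times)]


def synthesize(
    mask: np.ndarray,
    prompts: list[str],
    generate: Callable[[str, np.ndarray], np.ndarray],
) -> Iterator[tuple[np.ndarray, np.ndarray]]:
    """Yield (image, label) pairs; the input mask is reused as the label."""
    guidance = colorize(mask)
    for prompt in prompts:
        yield generate(prompt, guidance), mask


def dummy_generate(prompt: str, guidance: np.ndarray) -> np.ndarray:
    """Stand-in for a mask-conditioned generator (assumption): echoes the guidance."""
    return guidance.copy()


# Toy 4x6 label map: sky on top, road below, a building in the lower right.
mask = np.zeros((4, 6), dtype=np.int64)
mask[:2] = 3
mask[2:, 3:] = 1

prompts = prompt_variations(
    "a city street", ["sunny", "rainy", "foggy"], ["at noon", "at night"]
)
pairs = list(synthesize(mask, prompts, dummy_generate))
```

Because the label is passed through unchanged, this scheme only yields valid training pairs if the generator actually respects the mask layout, which is the role the abstract assigns to controlled generation.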


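Source journal

IEEE Computer Graphics and Applications (Engineering & Technology: Computer Science, Software Engineering)

CiteScore

3.20

Self-citation rate

5.60%

Articles published

160

Review time

>12 weeks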

About the journal: IEEE Computer Graphics and Applications (CG&A) bridges the theory and practice of computer graphics, visualization, virtual and augmented reality, and HCI. From specific algorithms to full system implementations, CG&A offers a unique combination of peer-reviewed feature articles and informal departments. Theme issues guest edited by leading researchers in their fields track the latest developments and trends in computer-generated graphical content, while tutorials and surveys provide a broad overview of interesting and timely topics. Regular departments further explore the core areas of graphics as well as extend into topics such as usability, education, history, and opinion. Each issue, the story of our cover focuses on creative applications of the technology by an artist or designer. Published six times a year, CG&A is indispensable reading for people working at the leading edge of computer-generated graphics technology and its applications in everything from business to the arts.