Highly realistic synthetic dataset for pixel-level DensePose estimation via diffusion model

IF 7.5 · Zone 1, Computer Science · Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Jiaxiao Wen, Tao Chu, Qiong Liu
Journal: Pattern Recognition, vol. 159, Article 111137
Publication date: 2024-11-06
Publication type: Journal Article
DOI: 10.1016/j.patcog.2024.111137
URL: https://www.sciencedirect.com/science/article/pii/S0031320324008884
Citations: 0

Abstract

Generating training data with pixel-level annotations for DensePose is a labor-intensive task, resulting in sparse labeling in real-world datasets. Prior solutions have relied on specialized data generation systems to synthesize datasets. However, these synthetic datasets often lack realism and rely on expensive resources such as human body models and texture mappings. In this paper, we address these challenges by introducing a novel data generation method based on the diffusion model, effectively producing highly realistic data without the need for expensive resources. Specifically, our method comprises annotation generation and image generation. Using graphics renderers and SMPL models, we produce synthetic annotations based solely on human poses and shapes. Subsequently, guided by these annotations, we employ simple yet effective textual prompts to generate a wide range of realistic images using the diffusion model. Our experiments on the DensePose-COCO dataset demonstrate the superiority of our method compared to existing methods. Code and benchmarks will be released.
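The two-stage structure described in the abstract (annotations first, then prompt-guided image generation) can be illustrated with a minimal planning sketch. Everything here is an assumption made for illustration: the prompt template, the names `SyntheticSample`, `build_prompts`, and `plan_dataset`, and the annotation identifiers are hypothetical and are not taken from the authors' code. The sketch only pairs annotation identifiers with text prompts; it does not render SMPL annotations or call a diffusion model.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SyntheticSample:
    """One planned training example: an annotation plus its guiding prompt."""
    annotation_id: str  # hypothetical key for a rendered annotation map (stage 1)
    prompt: str         # text prompt that would guide image generation (stage 2)


def build_prompts(subjects, scenes):
    """Combine subject and scene phrases into simple prompts.

    The "a photo of ..." template is an assumption, not the paper's template.
    """
    return [f"a photo of {subject} {scene}" for subject, scene in product(subjects, scenes)]


def plan_dataset(annotation_ids, prompts):
    """Assign prompts to annotations round-robin, one prompt per annotation."""
    if not prompts:
        raise ValueError("at least one prompt is required")
    return [
        SyntheticSample(annotation_id=a, prompt=prompts[i % len(prompts)])
        for i, a in enumerate(annotation_ids)
    ]


if __name__ == "__main__":
    prompts = build_prompts(["a man", "a woman"], ["on a beach", "in a park"])
    for sample in plan_dataset(["pose_0001", "pose_0002", "pose_0003"], prompts):
        print(sample)
```

Keeping the plan as plain data separates the choice of prompts from the expensive rendering and generation steps, so the same annotation can be paired with different prompts without re-rendering.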
Source journal: Pattern Recognition (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Review time: 5.6 months
Journal description: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.