Automatic Virtual 3D City Generation for Synthetic Data Collection

Bingyu Shen, Boyang Li, W. Scheirer
Venue: 2021 IEEE Winter Conference on Applications of Computer Vision Workshops (WACVW)
Published: 2021-01-01
DOI: 10.1109/WACVW52041.2021.00022 (https://doi.org/10.1109/WACVW52041.2021.00022)
Citations: 4

Abstract

Computer vision has achieved strong results with the rapid development of new techniques in deep neural networks. Object detection in the wild is a core task in computer vision and already has many successful real-world applications. However, deep neural networks for object detection usually consist of hundreds, and sometimes even thousands, of layers. Training such networks is challenging, and training data has a fundamental impact on model performance. Because data collection and annotation are expensive and labor-intensive, many data augmentation methods have been proposed to generate synthetic data for neural network training. Most of those methods focus on manipulating 2D images. In contrast, we leverage the realistic visual effects of 3D environments and propose a new way of generating synthetic data for computer vision tasks related to city scenes. Specifically, we describe a pipeline that generates a 3D city model from a 2D input image that portrays the layout design of a city. The pipeline also takes optional parameters to further customize the output 3D city model. Using our pipeline, a virtual 3D city model with high-quality textures can be generated within seconds, and the output is an object ready to render. The generated model helps people with limited 3D development knowledge create high-quality city scenes for different needs. As examples, we show the use of generated 3D city models as the synthetic data source for a scene text detection task and a traffic sign detection task. Both qualitative and quantitative results show that the generated virtual city is a good match to real-world data and can potentially benefit other computer vision tasks with similar contexts.
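To make the layout-to-model idea concrete, the following is a minimal sketch and not the authors' pipeline. It assumes the 2D layout has already been reduced to a small character grid (a stand-in for the input image), extrudes each building cell into a box, and writes the result as Wavefront OBJ text. The cell codes, the `Box` type, the function names, and the optional parameters (`cell_size`, `min_height`, `max_height`, `seed`) are all hypothetical and chosen only for illustration; textures and road geometry are omitted.

```python
import random
from dataclasses import dataclass

# Hypothetical cell codes for a toy layout grid; the paper's real input is a 2D image.
BUILDING, ROAD, EMPTY = "B", "R", "."


@dataclass
class Box:
    x: float       # minimum corner along the x axis
    y: float       # minimum corner along the y axis (ground plane)
    width: float   # extent along x
    depth: float   # extent along y
    height: float  # extent along z (up)


def layout_to_boxes(layout, cell_size=10.0, min_height=10.0, max_height=60.0, seed=0):
    """Extrude every building cell of the grid into a box of random height."""
    rng = random.Random(seed)  # seeded so the same layout gives the same city
    boxes = []
    for row, line in enumerate(layout):
        for col, cell in enumerate(line):
            if cell == BUILDING:
                height = rng.uniform(min_height, max_height)
                boxes.append(Box(col * cell_size, row * cell_size,
                                 cell_size, cell_size, height))
    return boxes


def boxes_to_obj(boxes):
    """Serialize boxes as OBJ text: 8 vertices and 6 quad faces per box.

    Face winding is not normalized, so a renderer may need double-sided shading.
    """
    lines = []
    quads = [(1, 2, 3, 4), (5, 6, 7, 8), (1, 2, 6, 5),
             (2, 3, 7, 6), (3, 4, 8, 7), (4, 1, 5, 8)]
    for i, b in enumerate(boxes):
        x0, y0 = b.x, b.y
        x1, y1, h = b.x + b.width, b.y + b.depth, b.height
        corners = [(x0, y0, 0.0), (x1, y0, 0.0), (x1, y1, 0.0), (x0, y1, 0.0),
                   (x0, y0, h), (x1, y0, h), (x1, y1, h), (x0, y1, h)]
        lines += [f"v {x} {y} {z}" for x, y, z in corners]
        offset = 8 * i  # OBJ vertex indices are 1-based and global
        lines += ["f " + " ".join(str(offset + k) for k in q) for q in quads]
    return "\n".join(lines)


if __name__ == "__main__":
    toy_layout = ["BRB",
                  "RRR",
                  ".B."]
    city = layout_to_boxes(toy_layout, max_height=40.0)
    print(boxes_to_obj(city))
```

Running the script prints OBJ text that most 3D tools can import. Keeping the optional parameters as keyword arguments with defaults mirrors the abstract's description of a required layout input plus optional customization, though the actual parameters of the authors' pipeline are not stated here.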