Shooting Labels: 3D Semantic Labeling by Virtual Reality

Pierluigi Zama Ramirez, Claudio Paternesi, Daniele De Gregorio, L. D. Stefano
{"title":"拍摄标签:虚拟现实的3D语义标签","authors":"Pierluigi Zama Ramirez, Claudio Paternesi, Daniele De Gregorio, L. D. Stefano","doi":"10.1109/AIVR50618.2020.00027","DOIUrl":null,"url":null,"abstract":"Availability of a few, large-size, annotated datasets, like ImageNet, Pascal VOC and COCO, has lead deep learning to revolutionize computer vision research by achieving astonishing results in several vision tasks. We argue that new tools to facilitate generation of annotated datasets may help spreading data-driven AI throughout applications and domains. In this work we propose Shooting Labels, the first 3D labeling tool for dense 3D semantic segmentation which exploits Virtual Reality to render the labeling task as easy and fun as playing a video-game. Our tool allows for semantically labeling large scale environments very expeditiously, whatever the nature of the 3D data at hand (e.g. point clouds, mesh). Furthermore, Shooting Labels efficiently integrates multiusers annotations to improve the labeling accuracy automatically and compute a label uncertainty map. Besides, within our framework the 3D annotations can be projected into 2D images, thereby speeding up also a notoriously slow and expensive task such as pixel-wise semantic labeling. We demonstrate the accuracy and efficiency of our tool in two different scenarios: an indoor workspace provided by Matterport3D and a large-scale outdoor environment reconstructed from 1000+ KITTI images.","PeriodicalId":348199,"journal":{"name":"2020 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"Shooting Labels: 3D Semantic Labeling by Virtual Reality\",\"authors\":\"Pierluigi Zama Ramirez, Claudio Paternesi, Daniele De Gregorio, L. D. Stefano\",\"doi\":\"10.1109/AIVR50618.2020.00027\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Availability of a few, large-size, annotated datasets, like ImageNet, Pascal VOC and COCO, has lead deep learning to revolutionize computer vision research by achieving astonishing results in several vision tasks. We argue that new tools to facilitate generation of annotated datasets may help spreading data-driven AI throughout applications and domains. In this work we propose Shooting Labels, the first 3D labeling tool for dense 3D semantic segmentation which exploits Virtual Reality to render the labeling task as easy and fun as playing a video-game. Our tool allows for semantically labeling large scale environments very expeditiously, whatever the nature of the 3D data at hand (e.g. point clouds, mesh). Furthermore, Shooting Labels efficiently integrates multiusers annotations to improve the labeling accuracy automatically and compute a label uncertainty map. Besides, within our framework the 3D annotations can be projected into 2D images, thereby speeding up also a notoriously slow and expensive task such as pixel-wise semantic labeling. 
We demonstrate the accuracy and efficiency of our tool in two different scenarios: an indoor workspace provided by Matterport3D and a large-scale outdoor environment reconstructed from 1000+ KITTI images.\",\"PeriodicalId\":348199,\"journal\":{\"name\":\"2020 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)\",\"volume\":\"11 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-10-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/AIVR50618.2020.00027\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AIVR50618.2020.00027","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 12

Abstract

The availability of a few large, annotated datasets, such as ImageNet, Pascal VOC and COCO, has led deep learning to revolutionize computer vision research by achieving astonishing results in several vision tasks. We argue that new tools facilitating the generation of annotated datasets may help spread data-driven AI across applications and domains. In this work we propose Shooting Labels, the first 3D labeling tool for dense 3D semantic segmentation that exploits Virtual Reality to make the labeling task as easy and fun as playing a video game. Our tool allows large-scale environments to be labeled semantically and very quickly, whatever the nature of the 3D data at hand (e.g., point clouds, meshes). Furthermore, Shooting Labels efficiently integrates multi-user annotations to automatically improve labeling accuracy and to compute a label uncertainty map. Moreover, within our framework the 3D annotations can be projected into 2D images, thereby also speeding up a notoriously slow and expensive task such as pixel-wise semantic labeling. We demonstrate the accuracy and efficiency of our tool in two different scenarios: an indoor workspace provided by Matterport3D and a large-scale outdoor environment reconstructed from 1000+ KITTI images.
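The abstract states that Shooting Labels automatically integrates multi-user annotations and computes a label uncertainty map, but it does not spell out the fusion rule. The following is a minimal sketch of one plausible scheme, assuming per-point labels from several annotators fused by majority voting, with the normalized entropy of the votes serving as the uncertainty; the function name fuse_annotations and the -1 convention for unlabeled points are illustrative assumptions, not the tool's API.

```python
import numpy as np

def fuse_annotations(votes: np.ndarray, num_classes: int):
    """Fuse per-point labels collected from several annotators.

    votes: (num_points, num_annotators) integer matrix; -1 marks points an
    annotator left unlabeled. Returns the majority-vote label per point and
    a normalized vote entropy in [0, 1] as its uncertainty
    (0 = full agreement, 1 = maximal disagreement).
    """
    num_points = votes.shape[0]
    counts = np.zeros((num_points, num_classes), dtype=np.float64)
    for c in range(num_classes):
        counts[:, c] = (votes == c).sum(axis=1)   # votes received by class c

    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0                     # points nobody labeled
    probs = counts / totals

    fused = counts.argmax(axis=1)                 # majority-vote label
    log_p = np.log(np.where(probs > 0, probs, 1.0))
    entropy = -(probs * log_p).sum(axis=1)
    uncertainty = entropy / np.log(num_classes)   # normalize to [0, 1]
    return fused, uncertainty

# Toy usage: four annotators, three points, three classes.
votes = np.array([[0, 0, 0, 0],      # unanimous
                  [2, 2, 1, 2],      # one dissenting vote
                  [1, -1, 1, 1]])    # one annotator skipped the point
labels, unc = fuse_annotations(votes, num_classes=3)
# labels -> [0, 2, 1]; unc -> [0.0, ~0.51, 0.0]
```

The abstract also mentions projecting the 3D annotations into 2D images to obtain pixel-wise semantic labels. Below is a small sketch of the underlying idea for a labeled point cloud, assuming a standard pinhole camera with known intrinsics K and world-to-camera pose T_wc and a simple per-pixel z-buffer; the actual tool presumably rasterizes meshes through a rendering engine, so this illustrates the projection step rather than its implementation.

```python
import numpy as np

def render_label_image(points, labels, K, T_wc, height, width, void_label=255):
    """Project labeled 3D points into a per-pixel label image.

    points: (N, 3) world-space coordinates; labels: (N,) integer labels.
    K: (3, 3) camera intrinsics; T_wc: (4, 4) world-to-camera transform.
    The nearest point wins at each pixel; pixels reached by no point keep
    void_label.
    """
    # World -> camera coordinates.
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])
    cam = (T_wc @ pts_h.T).T[:, :3]
    in_front = cam[:, 2] > 0
    cam, lab = cam[in_front], labels[in_front]

    # Perspective projection to pixel coordinates.
    uvw = (K @ cam.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    valid = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z, lab = u[valid], v[valid], cam[valid, 2], lab[valid]

    label_img = np.full((height, width), void_label, dtype=np.int32)
    depth = np.full((height, width), np.inf)
    # Process farthest points first so that closer ones overwrite them.
    for i in np.argsort(-z):
        if z[i] < depth[v[i], u[i]]:
            depth[v[i], u[i]] = z[i]
            label_img[v[i], u[i]] = lab[i]
    return label_img
```

For mesh data the same idea applies by rasterizing triangles and writing their per-vertex or per-face labels into the label image instead of individual points.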