A fully end-to-end deep learning approach for real-time simultaneous 3D reconstruction and material recognition

Cheng Zhao, Li Sun, R. Stolkin
Published in: 2017 18th International Conference on Advanced Robotics (ICAR)
DOI: 10.1109/ICAR.2017.8023499 (https://doi.org/10.1109/ICAR.2017.8023499)
Publication date: 2017-03-14
Citations: 38

Abstract

This paper addresses the problem of simultaneous 3D reconstruction and material recognition and segmentation. Enabling robots to recognise different materials (concrete, metal, etc.) in a scene is important for many tasks, e.g. robotic interventions in nuclear decommissioning. Previous work on 3D semantic reconstruction has predominantly focused on recognising everyday domestic objects (tables, chairs, etc.), whereas previous work on material recognition has largely been confined to single 2D images without any 3D reconstruction. Meanwhile, most 3D semantic reconstruction methods rely on computationally expensive post-processing with fully connected Conditional Random Fields (CRFs) to achieve consistent segmentations. In contrast, we propose a deep learning method that performs 3D reconstruction while simultaneously recognising different types of materials and labelling them at the pixel level. Unlike previous methods, our approach is fully end-to-end and requires neither hand-crafted features nor CRF post-processing. Instead, we use only learned features, and the CRF segmentation constraints are incorporated inside the end-to-end learned system. We present the results of experiments in which we trained our system to perform real-time 3D semantic reconstruction for 23 different materials in a real-world application. The run-time performance of the system can be boosted to around 10 Hz on a conventional GPU, which is enough to achieve real-time semantic reconstruction using a 30 fps RGB-D camera. To the best of our knowledge, this work is the first real-time end-to-end system for simultaneous 3D reconstruction and material recognition.
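The abstract does not spell out how the CRF constraints are folded into the learned system. One general way to make CRF inference trainable is to unroll mean-field updates as differentiable steps; whether the authors do exactly this is an assumption here. The toy sketch below shows a single mean-field update for a fully connected CRF with a Potts label compatibility. The function names, the scalar per-pixel feature, and the `bandwidth` and `weight` parameters are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_field_step(unary, q, features, bandwidth=1.0, weight=1.0):
    """One mean-field update for a fully connected CRF (Potts model).

    unary[i][l] : log-score of label l at pixel i (e.g. from a network)
    q[i][l]     : current estimate of the marginal of label l at pixel i
    features[i] : scalar feature of pixel i (hypothetical, e.g. intensity)
    """
    n = len(unary)
    num_labels = len(unary[0])
    new_q = []
    for i in range(n):
        # Message passing: Gaussian-weighted sum of the other pixels' marginals.
        msg = [0.0] * num_labels
        for j in range(n):
            if j == i:
                continue
            diff = features[i] - features[j]
            k = math.exp(-(diff * diff) / (2.0 * bandwidth ** 2))
            for l in range(num_labels):
                msg[l] += k * q[j][l]
        # Potts compatibility: label l is penalised by the mass that
        # similar pixels place on every other label.
        total = sum(msg)
        scores = [unary[i][l] - weight * (total - msg[l])
                  for l in range(num_labels)]
        # Normalise back to a probability distribution.
        new_q.append(softmax(scores))
    return new_q


# Toy example: four "pixels", two labels. Pixel 1 is ambiguous, but its
# feature is close to pixels 0 and 2, which are confident in label 0.
probs = [[0.9, 0.1], [0.45, 0.55], [0.9, 0.1], [0.1, 0.9]]
unary = [[math.log(p) for p in row] for row in probs]
q0 = [softmax(row) for row in unary]
q1 = mean_field_step(unary, q0, features=[0.0, 0.1, 0.05, 5.0])
print(q1[1])  # pixel 1 now favours label 0, pulled by its similar neighbours
```

In a trainable version, the same arithmetic would be expressed with tensor operations so that gradients flow through the update into the network that produces `unary`. This pure-Python loop is quadratic in the number of pixels and is meant only to show the update rule.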