Claudio Cimarelli, Dario Cazzato, M. Olivares-Méndez, H. Voos
{"title":"遮挡运动物体对cnn相机姿态回归影响的实例研究","authors":"Claudio Cimarelli, Dario Cazzato, M. Olivares-Méndez, H. Voos","doi":"10.1109/AVSS.2019.8909904","DOIUrl":null,"url":null,"abstract":"Robot self-localization is essential for operating autonomously in open environments. When cameras are the main source of information for retrieving the pose, numerous challenges are posed by the presence of dynamic objects, due to occlusion and continuous changes in the appearance. Recent research on global localization methods focused on using a single (or multiple) Convolutional Neural Network (CNN) to estimate the 6 Degrees of Freedom (6-DoF) pose directly from a monocular camera image. In contrast with the classical approaches using engineered feature detector, CNNs are usually more robust to environmental changes in light and to occlusions in outdoor scenarios. This paper contains an attempt to empirically demonstrate the ability of CNNs to ignore dynamic elements, such as pedestrians or cars, through learning. For this purpose, we pre-process a dataset for pose localization with an object segmentation network, masking potentially moving objects. Hence, we compare the pose regression CNN trained and/or tested on the set of masked images and the original one. Experimental results show that the performances of the two training approaches are similar, with a slight reduction of the error when hiding occluding objects from the views.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"491 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A case study on the impact of masking moving objects on the camera pose regression with CNNs\",\"authors\":\"Claudio Cimarelli, Dario Cazzato, M. Olivares-Méndez, H. Voos\",\"doi\":\"10.1109/AVSS.2019.8909904\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Robot self-localization is essential for operating autonomously in open environments. When cameras are the main source of information for retrieving the pose, numerous challenges are posed by the presence of dynamic objects, due to occlusion and continuous changes in the appearance. Recent research on global localization methods focused on using a single (or multiple) Convolutional Neural Network (CNN) to estimate the 6 Degrees of Freedom (6-DoF) pose directly from a monocular camera image. In contrast with the classical approaches using engineered feature detector, CNNs are usually more robust to environmental changes in light and to occlusions in outdoor scenarios. This paper contains an attempt to empirically demonstrate the ability of CNNs to ignore dynamic elements, such as pedestrians or cars, through learning. For this purpose, we pre-process a dataset for pose localization with an object segmentation network, masking potentially moving objects. Hence, we compare the pose regression CNN trained and/or tested on the set of masked images and the original one. 
Experimental results show that the performances of the two training approaches are similar, with a slight reduction of the error when hiding occluding objects from the views.\",\"PeriodicalId\":243194,\"journal\":{\"name\":\"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)\",\"volume\":\"491 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/AVSS.2019.8909904\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AVSS.2019.8909904","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A case study on the impact of masking moving objects on the camera pose regression with CNNs
Robot self-localization is essential for operating autonomously in open environments. When cameras are the main source of information for retrieving the pose, the presence of dynamic objects poses numerous challenges due to occlusion and continuous changes in appearance. Recent research on global localization methods has focused on using a single (or multiple) Convolutional Neural Network (CNN) to estimate the 6 Degrees of Freedom (6-DoF) pose directly from a monocular camera image. In contrast with classical approaches that use engineered feature detectors, CNNs are usually more robust to environmental changes in lighting and to occlusions in outdoor scenarios. This paper attempts to empirically demonstrate the ability of CNNs to learn to ignore dynamic elements, such as pedestrians or cars. For this purpose, we pre-process a dataset for pose localization with an object segmentation network, masking potentially moving objects. We then compare the pose regression CNN trained and/or tested on the set of masked images and on the original one. Experimental results show that the performances of the two training approaches are similar, with a slight reduction of the error when occluding objects are hidden from the views.
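To make the masking pre-processing step concrete, the sketch below blacks out pixels belonging to potentially moving object classes (pedestrians, cars, buses, etc.) before the image is fed to a pose regression CNN. It assumes an off-the-shelf Mask R-CNN from torchvision as the object segmentation network; the specific segmentation model, class list, and confidence threshold are illustrative assumptions, not the authors' exact pipeline.

```python
# Minimal sketch, assuming torchvision's pre-trained Mask R-CNN as the
# object segmentation network (an assumption; the abstract does not name it).
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# COCO category IDs of potentially moving objects:
# 1 person, 2 bicycle, 3 car, 4 motorcycle, 6 bus, 8 truck.
DYNAMIC_CLASSES = {1, 2, 3, 4, 6, 8}

segmenter = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
segmenter.eval()

def mask_dynamic_objects(image_path: str, score_thresh: float = 0.5) -> torch.Tensor:
    """Return the image tensor with potentially moving objects blacked out."""
    img = to_tensor(Image.open(image_path).convert("RGB"))  # [3, H, W] in [0, 1]
    with torch.no_grad():
        pred = segmenter([img])[0]  # masks: [N, 1, H, W], labels/scores: [N]
    masked = img.clone()
    for mask, label, score in zip(pred["masks"], pred["labels"], pred["scores"]):
        if label.item() in DYNAMIC_CLASSES and score.item() >= score_thresh:
            masked[:, mask[0] > 0.5] = 0.0  # zero out the segmented region
    return masked  # fed to the pose regression CNN in place of the raw image
```

Images processed this way would form the masked image set that the abstract compares against the original images when training and/or testing the pose regression CNN.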