Adrian Bojko, R. Dupont, M. Tamaazousti, H. Borgne
{"title":"动态环境中自我改进的SLAM:学习何时屏蔽","authors":"Adrian Bojko, R. Dupont, M. Tamaazousti, H. Borgne","doi":"10.48550/arXiv.2210.08350","DOIUrl":null,"url":null,"abstract":"Visual SLAM - Simultaneous Localization and Mapping - in dynamic environments typically relies on identifying and masking image features on moving objects to prevent them from negatively affecting performance. Current approaches are suboptimal: they either fail to mask objects when needed or, on the contrary, mask objects needlessly. Thus, we propose a novel SLAM that learns when masking objects improves its performance in dynamic scenarios. Given a method to segment objects and a SLAM, we give the latter the ability of Temporal Masking, i.e., to infer when certain classes of objects should be masked to maximize any given SLAM metric. We do not make any priors on motion: our method learns to mask moving objects by itself. To prevent high annotations costs, we created an automatic annotation method for self-supervised training. We constructed a new dataset, named ConsInv, which includes challenging real-world dynamic sequences respectively indoors and outdoors. Our method reaches the state of the art on the TUM RGB-D dataset and outperforms it on KITTI and ConsInv datasets.","PeriodicalId":72437,"journal":{"name":"BMVC : proceedings of the British Machine Vision Conference. British Machine Vision Conference","volume":"26 1","pages":"654"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Self-Improving SLAM in Dynamic Environments: Learning When to Mask\",\"authors\":\"Adrian Bojko, R. Dupont, M. Tamaazousti, H. Borgne\",\"doi\":\"10.48550/arXiv.2210.08350\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Visual SLAM - Simultaneous Localization and Mapping - in dynamic environments typically relies on identifying and masking image features on moving objects to prevent them from negatively affecting performance. Current approaches are suboptimal: they either fail to mask objects when needed or, on the contrary, mask objects needlessly. Thus, we propose a novel SLAM that learns when masking objects improves its performance in dynamic scenarios. Given a method to segment objects and a SLAM, we give the latter the ability of Temporal Masking, i.e., to infer when certain classes of objects should be masked to maximize any given SLAM metric. We do not make any priors on motion: our method learns to mask moving objects by itself. To prevent high annotations costs, we created an automatic annotation method for self-supervised training. We constructed a new dataset, named ConsInv, which includes challenging real-world dynamic sequences respectively indoors and outdoors. Our method reaches the state of the art on the TUM RGB-D dataset and outperforms it on KITTI and ConsInv datasets.\",\"PeriodicalId\":72437,\"journal\":{\"name\":\"BMVC : proceedings of the British Machine Vision Conference. British Machine Vision Conference\",\"volume\":\"26 1\",\"pages\":\"654\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"BMVC : proceedings of the British Machine Vision Conference. 
British Machine Vision Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2210.08350\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMVC : proceedings of the British Machine Vision Conference. British Machine Vision Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2210.08350","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Self-Improving SLAM in Dynamic Environments: Learning When to Mask
Visual SLAM (Simultaneous Localization and Mapping) in dynamic environments typically relies on identifying and masking image features on moving objects to prevent them from degrading performance. Current approaches are suboptimal: they either fail to mask objects when needed or, on the contrary, mask objects needlessly. We therefore propose a novel SLAM that learns when masking objects improves its performance in dynamic scenarios. Given an object segmentation method and a SLAM system, we endow the latter with Temporal Masking, i.e., the ability to infer when certain classes of objects should be masked to maximize a given SLAM metric. We make no prior assumptions about motion: our method learns on its own to mask moving objects. To avoid high annotation costs, we created an automatic annotation method for self-supervised training. We also constructed a new dataset, named ConsInv, which includes challenging real-world dynamic sequences, both indoor and outdoor. Our method matches the state of the art on the TUM RGB-D dataset and outperforms it on the KITTI and ConsInv datasets.
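To make the core idea of Temporal Masking concrete, the following is a minimal, hypothetical sketch (not the authors' actual pipeline or API): given an object segmentation and a way to run SLAM with a chosen set of masked classes, a decision loop keeps a class masked only when masking it improves the chosen SLAM metric. The names `run_slam`, `frames`, and `candidate_classes` are illustrative placeholders.

```python
# Illustrative sketch of a "learn when to mask" decision loop.
# Assumption: run_slam(frames, masked_classes) executes SLAM with the given
# object classes masked out and returns an error metric (lower is better),
# e.g. absolute trajectory error. This is not the paper's implementation.
from typing import Callable, Iterable, List, Set


def select_classes_to_mask(
    frames: List,                                  # input image sequence
    candidate_classes: Iterable[str],              # e.g. {"person", "car", "chair"}
    run_slam: Callable[[List, Set[str]], float],   # SLAM run -> error metric
) -> Set[str]:
    """Greedily keep a class masked only if masking it reduces the SLAM error."""
    masked: Set[str] = set()
    best_error = run_slam(frames, masked)          # baseline: mask nothing
    for cls in candidate_classes:
        trial = masked | {cls}
        error = run_slam(frames, trial)
        if error < best_error:                     # masking this class helped
            masked, best_error = trial, error
    return masked
```

In this toy version the supervisory signal is simply the SLAM metric itself, which mirrors the paper's self-supervised framing: no motion prior is imposed, and classes end up masked only when doing so demonstrably improves performance on the sequence.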