A deep neural network approach to fusing vision and heteroscedastic motion estimates for low-SWaP robotic applications

Jared Shamwell, W. Nothwang, D. Perlis
{"title":"一种融合视觉和异方差运动估计的深度神经网络方法,用于低swap机器人应用","authors":"Jared Shamwell, W. Nothwang, D. Perlis","doi":"10.1109/MFI.2017.8170407","DOIUrl":null,"url":null,"abstract":"Due both to the speed and quality of their sensors and restrictive on-board computational capabilities, current state-of-the-art (SOA) size, weight, and power (SWaP) constrained autonomous robotic systems are limited in their abilities to sample, fuse, and analyze sensory data for state estimation. Aimed at improving SWaP-constrained robotic state estimation, we present Multi-Hypothesis DeepEfference (MHDE) — an unsupervised, deep convolutional-deconvolutional sensor fusion network that learns to intelligently combine noisy heterogeneous sensor data to predict several probable hypotheses for the dense, pixel-level correspondence between a source image and an unseen target image. This new multi-hypothesis formulation of our previous architecture, DeepEfference [1], has been augmented to handle dynamic heteroscedastic sensor and motion noise and computes hypothesis image mappings and predictions at 150–400 Hz depending on the number of hypotheses being generated. MHDE fuses noisy, heterogeneous sensory inputs using two parallel architectural pathways and n (1, 2, 4, or 8 in this work) multi-hypothesis generation subpathways to generate n pixel-level predictions and correspondences between source and target images. We evaluated MHDE on the KITTI Odometry dataset [2] and benchmarked it against DeepEfference [1] and DeepMatching [3] by mean pixel error and runtime. MHDE with 8 hypotheses outperformed DeepEfference in root mean squared (RMSE) pixel error by 103% in the maximum heteroscedastic noise condition and by 18% in the noise-free condition. MHDE with 8 hypotheses was over 5, 000% faster than DeepMatching with only a 3% increase in RMSE.","PeriodicalId":402371,"journal":{"name":"2017 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"A deep neural network approach to fusing vision and heteroscedastic motion estimates for low-SWaP robotic applications\",\"authors\":\"Jared Shamwell, W. Nothwang, D. Perlis\",\"doi\":\"10.1109/MFI.2017.8170407\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Due both to the speed and quality of their sensors and restrictive on-board computational capabilities, current state-of-the-art (SOA) size, weight, and power (SWaP) constrained autonomous robotic systems are limited in their abilities to sample, fuse, and analyze sensory data for state estimation. Aimed at improving SWaP-constrained robotic state estimation, we present Multi-Hypothesis DeepEfference (MHDE) — an unsupervised, deep convolutional-deconvolutional sensor fusion network that learns to intelligently combine noisy heterogeneous sensor data to predict several probable hypotheses for the dense, pixel-level correspondence between a source image and an unseen target image. This new multi-hypothesis formulation of our previous architecture, DeepEfference [1], has been augmented to handle dynamic heteroscedastic sensor and motion noise and computes hypothesis image mappings and predictions at 150–400 Hz depending on the number of hypotheses being generated. 
MHDE fuses noisy, heterogeneous sensory inputs using two parallel architectural pathways and n (1, 2, 4, or 8 in this work) multi-hypothesis generation subpathways to generate n pixel-level predictions and correspondences between source and target images. We evaluated MHDE on the KITTI Odometry dataset [2] and benchmarked it against DeepEfference [1] and DeepMatching [3] by mean pixel error and runtime. MHDE with 8 hypotheses outperformed DeepEfference in root mean squared (RMSE) pixel error by 103% in the maximum heteroscedastic noise condition and by 18% in the noise-free condition. MHDE with 8 hypotheses was over 5, 000% faster than DeepMatching with only a 3% increase in RMSE.\",\"PeriodicalId\":402371,\"journal\":{\"name\":\"2017 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MFI.2017.8170407\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MFI.2017.8170407","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5

Abstract

Due both to the speed and quality of their sensors and restrictive on-board computational capabilities, current state-of-the-art (SOA) size, weight, and power (SWaP) constrained autonomous robotic systems are limited in their abilities to sample, fuse, and analyze sensory data for state estimation. Aimed at improving SWaP-constrained robotic state estimation, we present Multi-Hypothesis DeepEfference (MHDE) — an unsupervised, deep convolutional-deconvolutional sensor fusion network that learns to intelligently combine noisy heterogeneous sensor data to predict several probable hypotheses for the dense, pixel-level correspondence between a source image and an unseen target image. This new multi-hypothesis formulation of our previous architecture, DeepEfference [1], has been augmented to handle dynamic heteroscedastic sensor and motion noise and computes hypothesis image mappings and predictions at 150–400 Hz depending on the number of hypotheses being generated. MHDE fuses noisy, heterogeneous sensory inputs using two parallel architectural pathways and n (1, 2, 4, or 8 in this work) multi-hypothesis generation subpathways to generate n pixel-level predictions and correspondences between source and target images. We evaluated MHDE on the KITTI Odometry dataset [2] and benchmarked it against DeepEfference [1] and DeepMatching [3] by mean pixel error and runtime. MHDE with 8 hypotheses outperformed DeepEfference in root mean squared (RMSE) pixel error by 103% in the maximum heteroscedastic noise condition and by 18% in the noise-free condition. MHDE with 8 hypotheses was over 5,000% faster than DeepMatching with only a 3% increase in RMSE.
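
The abstract describes the architecture only at a high level: two parallel encoding pathways (one per sensor modality) feed n deconvolutional subpathways, each of which decodes one dense pixel-correspondence hypothesis, trained without supervision. The sketch below is a minimal PyTorch illustration of that general scheme, not the authors' implementation: the layer sizes, the 6-DoF motion-vector input, and the winner-take-all photometric loss are all assumptions the abstract does not specify.

```python
# Illustrative sketch only -- not the authors' implementation. It shows the
# general multi-hypothesis idea from the abstract: fuse an image with a noisy
# motion estimate through two parallel pathways, decode n dense displacement
# hypotheses, and train unsupervised on photometric error. The layer sizes,
# the 6-DoF motion input, and the winner-take-all loss are all assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHypothesisFusionNet(nn.Module):
    def __init__(self, n_hypotheses: int = 8):
        super().__init__()
        # Pathway 1: convolutional encoder for the source image (downsamples 4x).
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
        )
        # Pathway 2: encoder for a noisy motion estimate (assumed 6-DoF here).
        self.motion_encoder = nn.Sequential(nn.Linear(6, 64), nn.ReLU())
        # n deconvolutional subpathways, each producing one dense 2-channel
        # (dx, dy) pixel-displacement hypothesis at full resolution.
        self.decoders = nn.ModuleList([
            nn.Sequential(
                nn.ConvTranspose2d(128, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, 2, 4, stride=2, padding=1),
            )
            for _ in range(n_hypotheses)
        ])

    def forward(self, source_img, motion_vec):
        f_img = self.image_encoder(source_img)        # B x 64 x H/4 x W/4
        f_mot = self.motion_encoder(motion_vec)       # B x 64
        f_mot = f_mot[:, :, None, None].expand(-1, -1, *f_img.shape[2:])
        fused = torch.cat([f_img, f_mot], dim=1)      # B x 128 x H/4 x W/4
        return [dec(fused) for dec in self.decoders]  # n flow hypotheses

def warp(source_img, flow):
    """Bilinearly warp source_img by a dense flow field (B x 2 x H x W)."""
    _, _, h, w = source_img.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=flow.device, dtype=flow.dtype),
        torch.arange(w, device=flow.device, dtype=flow.dtype),
        indexing="ij",
    )
    # Absolute sample coordinates, normalized to [-1, 1] for grid_sample.
    cx = 2.0 * (xs[None] + flow[:, 0]) / (w - 1) - 1.0
    cy = 2.0 * (ys[None] + flow[:, 1]) / (h - 1) - 1.0
    return F.grid_sample(source_img, torch.stack((cx, cy), dim=-1),
                         align_corners=True)

def winner_take_all_loss(hypotheses, source_img, target_img):
    # Unsupervised objective: photometric error of each warped prediction
    # against the unseen target; backpropagate only the best hypothesis
    # (a common multi-hypothesis training choice, assumed here).
    errors = torch.stack(
        [F.mse_loss(warp(source_img, f), target_img) for f in hypotheses]
    )
    return errors.min()
```

On the reported numbers: "over 5,000% faster" than DeepMatching corresponds to roughly a 50x runtime reduction, traded against only a 3% increase in RMSE pixel error.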