Video Stream Relocalization With Deep Learning

2018 5th International Conference on Systems and Informatics (ICSAI). Pub Date: 2018-11-01. DOI: 10.1109/ICSAI.2018.8599392

Tingting Hu, Hanxu Sun


Citations: 1

Abstract




This paper presents a six-degree-of-freedom (6-DOF) pose regression system that uses a convolutional neural network (CNN) and a long short-term memory (LSTM) network, with a video stream as the network input. The system trains the network to regress the 6-DOF robot pose end to end, using transfer learning, with little training data. Relocalization that uses only a CNN ignores the temporal correlation between the images in a sequence, even though a robot can easily collect continuous image sequences. In this paper, the robot therefore regresses its 6-DOF pose from continuous image sequences of different step sizes. Compared with relocalization from a single image, the experimental results show that, in the indoor scene, the network model relocalizes best, with the smallest relocalization error, when the step size is set to 10.
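The idea of feeding continuous image sequences of a given step size to the network can be illustrated with a small data-preparation sketch. This is not the authors' code. It assumes that "step size" means the number of consecutive frames in each input sequence and that each sequence is labeled with the pose of its last frame; the abstract confirms neither detail. The names `make_sequences` and `Pose`, and the pose layout, are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")
# Hypothetical 6-DOF layout (assumption): x, y, z, roll, pitch, yaw.
Pose = Tuple[float, float, float, float, float, float]


def make_sequences(
    frames: Sequence[Frame],
    poses: Sequence[Pose],
    step_size: int,
) -> Iterator[Tuple[List[Frame], Pose]]:
    """Yield (window, target) pairs from an ordered video stream.

    Each window holds `step_size` consecutive frames; the target is the
    pose of the last frame in the window (an assumption, see lead-in).
    """
    if step_size < 1:
        raise ValueError("step_size must be at least 1")
    if len(frames) != len(poses):
        raise ValueError("frames and poses must have the same length")
    # Slide a window of length step_size over the stream, one frame at a time.
    for end in range(step_size, len(frames) + 1):
        window = list(frames[end - step_size:end])
        yield window, poses[end - 1]
```

With `step_size=1` every window contains a single frame, which corresponds to the single-image baseline the abstract compares against; `step_size=10` matches the setting the abstract reports as best for the indoor scene.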

