Deep Imitation Learning for Safe Indoor Autonomous Micro Aerial Vehicle Navigation

R. J. Candare, Rolyn C. Daguil
{"title":"安全室内自主微型飞行器导航的深度模仿学习","authors":"R. J. Candare, Rolyn C. Daguil","doi":"10.1109/HNICEM51456.2020.9399994","DOIUrl":null,"url":null,"abstract":"Drones or MAVs are cyber-physical systems that have become ideal platforms for many applications in outdoor settings. Autonomous navigation in these outdoor settings has been effectively done using global positioning systems (GPS). Many human MAV pilots have demonstrated skills in controlling MAVs to maneuver in narrowed spaces indoors. However, this skill is hard to automate. Global positioning systems are unreliable indoors and in cluttered and confined environments. SLAM, being the most widely used analytical method, addresses the navigation problem by rendering a spatial map of an environment in which the Agent navigates while simultaneously localizing the Agent relative to this map. The main downside of SLAM is that rendering an entire map requires a large amount of computation, and rule-based techniques often lose their robustness in corner cases or situations that were not accounted for in developing the rules. Hence, learning directly from human demonstrations could produce improved results for complex tasks, particularly in sensor-limited systems. In this study, a policy that safely navigates a MAV through an indoor environment is learned through deep imitation learning. To effectively learn a policy that is robust to the domain or environment shifts, an ideal combination of monocular depth estimate and dense optical flow was determined to serve as state representation. Three different deep convolutional neural network architectures, namely, CNN, LSTM-RNN, and 3D CNN, were explored and developed to encode the navigation policy from expert demonstrations. The performance of these policies was then tested in a real environment. Results show that the CNN and 3D CNN policies successfully navigated the MAV around the obstacle set in the test environment while the LSTM-RNN did not. The Success Rate for CNN, LSTM-RNN, and 3D CNN were 90%,0%, and 90%, respectively.","PeriodicalId":230810,"journal":{"name":"2020 IEEE 12th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deep Imitation Learning for Safe Indoor Autonomous Micro Aerial Vehicle Navigation\",\"authors\":\"R. J. Candare, Rolyn C. Daguil\",\"doi\":\"10.1109/HNICEM51456.2020.9399994\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Drones or MAVs are cyber-physical systems that have become ideal platforms for many applications in outdoor settings. Autonomous navigation in these outdoor settings has been effectively done using global positioning systems (GPS). Many human MAV pilots have demonstrated skills in controlling MAVs to maneuver in narrowed spaces indoors. However, this skill is hard to automate. Global positioning systems are unreliable indoors and in cluttered and confined environments. SLAM, being the most widely used analytical method, addresses the navigation problem by rendering a spatial map of an environment in which the Agent navigates while simultaneously localizing the Agent relative to this map. 
The main downside of SLAM is that rendering an entire map requires a large amount of computation, and rule-based techniques often lose their robustness in corner cases or situations that were not accounted for in developing the rules. Hence, learning directly from human demonstrations could produce improved results for complex tasks, particularly in sensor-limited systems. In this study, a policy that safely navigates a MAV through an indoor environment is learned through deep imitation learning. To effectively learn a policy that is robust to the domain or environment shifts, an ideal combination of monocular depth estimate and dense optical flow was determined to serve as state representation. Three different deep convolutional neural network architectures, namely, CNN, LSTM-RNN, and 3D CNN, were explored and developed to encode the navigation policy from expert demonstrations. The performance of these policies was then tested in a real environment. Results show that the CNN and 3D CNN policies successfully navigated the MAV around the obstacle set in the test environment while the LSTM-RNN did not. The Success Rate for CNN, LSTM-RNN, and 3D CNN were 90%,0%, and 90%, respectively.\",\"PeriodicalId\":230810,\"journal\":{\"name\":\"2020 IEEE 12th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM)\",\"volume\":\"19 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE 12th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HNICEM51456.2020.9399994\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 12th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HNICEM51456.2020.9399994","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Drones, or MAVs, are cyber-physical systems that have become ideal platforms for many applications in outdoor settings. Autonomous navigation in these outdoor settings has been achieved effectively using global positioning systems (GPS). Many human MAV pilots have demonstrated skill in controlling MAVs to maneuver in narrow indoor spaces; however, this skill is hard to automate. Global positioning systems are unreliable indoors and in cluttered and confined environments. SLAM, the most widely used analytical method, addresses the navigation problem by rendering a spatial map of the environment in which the agent navigates while simultaneously localizing the agent relative to this map. The main downside of SLAM is that rendering an entire map requires a large amount of computation, and rule-based techniques often lose their robustness in corner cases or situations that were not accounted for when the rules were developed. Hence, learning directly from human demonstrations could produce improved results for complex tasks, particularly in sensor-limited systems. In this study, a policy that safely navigates a MAV through an indoor environment is learned through deep imitation learning. To effectively learn a policy that is robust to domain or environment shifts, an ideal combination of a monocular depth estimate and dense optical flow was determined to serve as the state representation. Three different deep convolutional neural network architectures, namely a CNN, an LSTM-RNN, and a 3D CNN, were explored and developed to encode the navigation policy from expert demonstrations. The performance of these policies was then tested in a real environment. Results show that the CNN and 3D CNN policies successfully navigated the MAV around the obstacle set in the test environment, while the LSTM-RNN did not. The success rates for the CNN, LSTM-RNN, and 3D CNN were 90%, 0%, and 90%, respectively.
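The abstract does not give implementation details, but the state representation (monocular depth plus dense optical flow) and the behavior-cloning objective it describes can be illustrated with a short sketch. The following Python snippet is a minimal, hypothetical example: the Farnebäck optical flow routine, the depth estimator interface, the CNN layer sizes, and the three-way discrete action space are assumptions made for illustration, not the configuration reported in the paper.

```python
# Hypothetical sketch of the state construction and behavior-cloning setup
# described in the abstract. All specifics (depth source, layer sizes,
# action space) are illustrative assumptions.
import cv2
import numpy as np
import torch
import torch.nn as nn


def dense_optical_flow(prev_gray: np.ndarray, gray: np.ndarray) -> np.ndarray:
    """Farneback dense optical flow between consecutive grayscale frames, shape (H, W, 2)."""
    return cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)


def build_state(depth: np.ndarray, flow: np.ndarray) -> torch.Tensor:
    """Stack a monocular depth estimate (H, W) with dense flow (H, W, 2) into a 3-channel state."""
    state = np.dstack([depth, flow]).astype(np.float32)           # (H, W, 3)
    return torch.from_numpy(state).permute(2, 0, 1).unsqueeze(0)  # (1, 3, H, W)


class PolicyCNN(nn.Module):
    """Small CNN policy mapping the stacked state to discrete MAV commands
    (e.g., go forward / yaw left / yaw right); sizes are placeholders."""

    def __init__(self, n_actions: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, n_actions)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))


# Behavior cloning: minimize cross-entropy between predicted and expert actions.
policy = PolicyCNN()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()


def train_step(states: torch.Tensor, expert_actions: torch.Tensor) -> float:
    """One supervised update on a batch of (state, expert action) pairs."""
    optimizer.zero_grad()
    loss = loss_fn(policy(states), expert_actions)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The LSTM-RNN and 3D CNN variants compared in the paper would consume short sequences of such states rather than single frames; this sketch shows only the single-frame CNN case.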