Aerial Image Instance Segmentation Through Synthetic Data Using Deep Learning

Felipe X. Viana, Gabriel Araújo, M. Pinto, J. Colares, D. B. Haddad
Journal: Learning and Nonlinear Models
Publication date: 2020-09-01
Publication type: Journal Article
DOI: https://doi.org/10.21528/lnlm-vol18-no1-art3
Citations: 3

Abstract

In recent decades, autonomous navigation has increasingly relied on computer vision rather than traditional techniques. The reason is that most spaces are designed for human navigation and are therefore filled with visual cues. Visual recognition is thus an essential ability for avoiding obstacles when an autonomous vehicle interacts with the real world. Collecting data with Unmanned Aerial Vehicles (UAVs) navigating in real-world scenarios is costly and time-consuming. For this reason, a database containing locations and interactions is one of the most valuable assets of technology companies. One solution to this problem is to adopt a photo-realistic 3D simulator as a data source, which makes it possible to gather a significant amount of data. This research therefore creates a dataset for instance segmentation using images from a front-facing UAV camera navigating in a 3D simulator. The work applies Mask R-CNN, a state-of-the-art deep learning technique. The architecture takes an image as input and predicts a per-pixel instance segmentation. Experimental results showed that Mask R-CNN achieves superior performance on our dataset when fine-tuning a model pretrained on the COCO dataset. In addition, promising results on real-world data indicate that the proposed methodology generalizes well.
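To make "per-pixel instance segmentation" concrete, the sketch below merges per-instance binary masks into a single label map in which every pixel carries an instance id. It is only an illustration of the output format, not the authors' pipeline: the function name masks_to_label_map, the toy masks, the scores, and the rule that the higher-scoring instance wins on overlap are all assumptions made for this example.

```python
from typing import List, Sequence

Mask = List[List[int]]  # binary mask: 1 = pixel belongs to the instance


def masks_to_label_map(masks: Sequence[Mask], scores: Sequence[float]) -> List[List[int]]:
    """Merge per-instance binary masks into one per-pixel label map.

    0 means background; instance k (0-based index in `masks`) gets label k + 1.
    Where masks overlap, the instance with the higher score wins
    (an assumed tie-breaking rule, chosen for illustration).
    """
    if not masks:
        return []
    height, width = len(masks[0]), len(masks[0][0])
    label_map = [[0] * width for _ in range(height)]
    # Paint the lowest-scoring instances first so higher scores overwrite them.
    order = sorted(range(len(masks)), key=lambda k: scores[k])
    for k in order:
        for y in range(height):
            for x in range(width):
                if masks[k][y][x]:
                    label_map[y][x] = k + 1
    return label_map


# Two hypothetical 3x4 instance masks that overlap in one pixel.
mask_a = [[1, 1, 0, 0],
          [1, 1, 0, 0],
          [0, 0, 0, 0]]
mask_b = [[0, 0, 0, 0],
          [0, 1, 1, 1],
          [0, 0, 1, 1]]
label_map = masks_to_label_map([mask_a, mask_b], scores=[0.9, 0.6])
for row in label_map:
    print(row)
```

Each printed row is one image row; a nonzero value identifies which instance the pixel belongs to, which is what distinguishes instance segmentation from a plain foreground/background mask.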