MonoSOD: Monocular Salient Object Detection based on Predicted Depth
George Dimas, Panagiota Gatoula, D. Iakovidis
2021 IEEE International Conference on Robotics and Automation (ICRA), 2021-05-30
DOI: 10.1109/ICRA48506.2021.9561211
Salient object detection (SOD) can directly improve the performance of tasks such as obstacle detection, semantic segmentation, and object recognition, which are important for robotic and other autonomous navigation systems. State-of-the-art SOD methodologies improve performance by incorporating depth information, usually acquired with additional specialized sensors, e.g., RGB-D cameras, which adds to the overall cost and reduces the flexibility of such systems. Nevertheless, recent advances in machine learning have produced models capable of generating depth-map approximations from a single RGB image. In this work, we propose a novel monocular SOD (MonoSOD) methodology based on a two-branch CNN autoencoder architecture capable of predicting depth maps and estimating saliency through a trainable refinement scheme. Its application to benchmark datasets indicates that its performance is comparable to that of state-of-the-art SOD methods relying on RGB-D data. It could therefore be considered a lower-cost alternative to such methods in future applications.
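The abstract describes a two-branch design: a shared encoding of the RGB input feeds one branch that predicts a depth map and another that estimates saliency, with the depth prediction used to refine the saliency output. The paper's actual model is a trained CNN autoencoder; the toy NumPy sketch below only illustrates the data flow of that idea under simplifying assumptions (average pooling stands in for the learned encoder, nearest-neighbour upsampling for the decoders, and a fixed depth-weighted product for the trainable refinement scheme). All function names here are hypothetical, not from the paper.

```python
import numpy as np

def encode(rgb, levels=2):
    """Toy shared encoder: collapse channels, then repeated 2x2 average pooling."""
    feats = rgb.mean(axis=2)  # H x W x 3 -> H x W
    for _ in range(levels):
        h, w = feats.shape
        feats = feats[: h // 2 * 2, : w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return feats

def decode(feats, shape):
    """Toy decoder: nearest-neighbour upsampling back to the input resolution."""
    fy = shape[0] // feats.shape[0]
    fx = shape[1] // feats.shape[1]
    return np.kron(feats, np.ones((fy, fx)))

def mono_sod_sketch(rgb):
    """Two-branch flow: shared features -> depth branch + saliency branch -> refinement."""
    h, w, _ = rgb.shape
    feats = encode(rgb)
    depth = decode(feats, (h, w))                   # branch 1: predicted depth map
    coarse_saliency = decode(1.0 - feats, (h, w))   # branch 2: coarse saliency estimate
    # "Refinement": emphasise regions the depth branch predicts as near.
    # (In the paper this step is trainable; here it is a fixed weighting.)
    refined = coarse_saliency * (1.0 - depth / (depth.max() + 1e-8))
    return refined / (refined.max() + 1e-8)

rgb = np.random.rand(64, 64, 3)   # stand-in for a single RGB frame
sal = mono_sod_sketch(rgb)
print(sal.shape)  # (64, 64)
```

The point of the sketch is only that no depth sensor appears anywhere: both the depth map and the saliency map are derived from the single RGB input, which is what makes the monocular setting a lower-cost alternative to RGB-D pipelines.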