{"title":"基于RGB室内场景图像融合估计深度和场景分割图的开放式监视导航辅助","authors":"Binoy Saha, Neha Shah, Sukhendu Das","doi":"10.1109/SPCOM55316.2022.9840820","DOIUrl":null,"url":null,"abstract":"Open-ended surveillance task for a robot in an unspecified environment using only an RGB camera, has not been addressed at length in literature. This is unlike the popular scenario of path planning where both the target and environments are often known. We focus on the task of a robot which needs to estimate a realistic depiction of the surrounding 3D environment, including the location of obstacles and free space to navigate in the scene within the view field. In this paper, we propose an unsupervised algorithm to iteratively compute an optimal direction for maximal unhindered movement in the scene. This task is challenging when presented with only a single RGB view of the scene, without the use of any online depth sensor. Our process combines cues from two deep-learning processes - semantic segmentation and depth map estimation, to automatically decide plausible robot movement paths while avoiding hindrance posed by objects in the scene. We make assumptions of the use of a low-end RGB USB camera, pre-set camera view direction (angle) and field of view, incremental movement of the robot in the view field, and iterative analysis of the scene, all catering to any open-ended (target-free) surveillance/patrolling applications. Inverse perspective geometry has been used to map the optimal direction estimated in the view field, to that on the floor of the scene for navigation. 
Results of evaluation using a dataset of videos of scenes captured from indoor (office, labs, meeting/class-rooms, corridors, lounge) environments, reveal the success of the proposed approach.","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"114 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Navigational Aid for Open-Ended Surveillance, by Fusing Estimated Depth and Scene Segmentation Maps, Using RGB Images of Indoor Scenes\",\"authors\":\"Binoy Saha, Neha Shah, Sukhendu Das\",\"doi\":\"10.1109/SPCOM55316.2022.9840820\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Open-ended surveillance task for a robot in an unspecified environment using only an RGB camera, has not been addressed at length in literature. This is unlike the popular scenario of path planning where both the target and environments are often known. We focus on the task of a robot which needs to estimate a realistic depiction of the surrounding 3D environment, including the location of obstacles and free space to navigate in the scene within the view field. In this paper, we propose an unsupervised algorithm to iteratively compute an optimal direction for maximal unhindered movement in the scene. This task is challenging when presented with only a single RGB view of the scene, without the use of any online depth sensor. Our process combines cues from two deep-learning processes - semantic segmentation and depth map estimation, to automatically decide plausible robot movement paths while avoiding hindrance posed by objects in the scene. 
We make assumptions of the use of a low-end RGB USB camera, pre-set camera view direction (angle) and field of view, incremental movement of the robot in the view field, and iterative analysis of the scene, all catering to any open-ended (target-free) surveillance/patrolling applications. Inverse perspective geometry has been used to map the optimal direction estimated in the view field, to that on the floor of the scene for navigation. Results of evaluation using a dataset of videos of scenes captured from indoor (office, labs, meeting/class-rooms, corridors, lounge) environments, reveal the success of the proposed approach.\",\"PeriodicalId\":246982,\"journal\":{\"name\":\"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)\",\"volume\":\"114 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-07-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SPCOM55316.2022.9840820\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPCOM55316.2022.9840820","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Navigational Aid for Open-Ended Surveillance, by Fusing Estimated Depth and Scene Segmentation Maps, Using RGB Images of Indoor Scenes
The open-ended surveillance task for a robot in an unspecified environment, using only an RGB camera, has not been addressed at length in the literature. This is unlike the popular path-planning scenario, where both the target and the environment are typically known. We focus on the task of a robot that must build a realistic depiction of the surrounding 3D environment, including the locations of obstacles and of free space for navigation within its field of view. In this paper, we propose an unsupervised algorithm that iteratively computes an optimal direction for maximal unhindered movement in the scene. The task is challenging given only a single RGB view of the scene, without any online depth sensor. Our process fuses cues from two deep-learning modules, semantic segmentation and depth-map estimation, to automatically decide plausible robot movement paths while avoiding hindrances posed by objects in the scene. We assume a low-end RGB USB camera, a pre-set camera view direction (angle) and field of view, incremental movement of the robot within the field of view, and iterative analysis of the scene, all catering to open-ended (target-free) surveillance and patrolling applications. Inverse perspective geometry maps the optimal direction estimated in the image plane onto the floor of the scene for navigation. Evaluation on a dataset of videos captured in indoor environments (offices, labs, meeting/class-rooms, corridors, a lounge) demonstrates the success of the proposed approach.
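The fusion step described in the abstract can be sketched as follows, purely as an illustration: fuse a per-pixel depth map with a floor segmentation mask to pick the image column with the deepest unobstructed floor, then project that direction onto floor coordinates via an inverse-perspective homography. The function names, the use of a column-wise maximum, and the homography matrix H are all assumptions for this sketch; the paper's actual networks, decision rule, and calibration are not specified here.

```python
import numpy as np

def optimal_direction(depth, floor_mask):
    """Fuse an estimated depth map (H x W, metres) with a boolean floor
    segmentation mask (H x W): for each image column, find the farthest
    reachable floor pixel, and return the column with the greatest reach."""
    # Non-floor pixels (obstacles, walls) contribute zero free depth.
    free = np.where(floor_mask, depth, 0.0)
    # Farthest reachable floor depth per column.
    reach = free.max(axis=0)
    # Column index serves as the heading within the view field.
    return int(reach.argmax())

def image_to_floor(u, v, H):
    """Inverse-perspective mapping: project image point (u, v) onto
    floor-plane coordinates using a 3x3 homography H, which would be
    obtained from the pre-set camera angle and field of view."""
    p = H @ np.array([u, v, 1.0])
    return p[:2] / p[2]  # de-homogenise

# Toy example: a 2x3 depth map where column 2 is blocked by an obstacle.
depth = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
mask = np.array([[True, True, False],
                 [True, True, False]])
col = optimal_direction(depth, mask)   # column 1: deepest free floor
floor_xy = image_to_floor(col, 1, np.eye(3))
```

With an identity homography the mapping is trivial; in practice H would be calibrated once for the fixed camera pose the paper assumes, so that the chosen image column translates into a movement direction on the actual floor.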