Improved indoor scene geometry recognition from single image based on depth map
Yixian Liu, Xinyu Lin, Qianni Zhang, E. Izquierdo
IVMSP 2013, published 2013-06-10
DOI: 10.1109/IVMSPW.2013.6611938 (https://doi.org/10.1109/IVMSPW.2013.6611938)
Interpreting 3D structure from 2D images is a long-standing problem in computer vision. Prior work has tackled it mainly in two ways: depth estimation from multiple-view images based on geometric triangulation, and depth reasoning from a single image based on monocular depth cues. Neither approach uses direct depth map information. In this work, we captured an RGBD dataset using the Microsoft Kinect depth sensor. Approximate depth information is acquired as a fourth channel and used as an extra reference for 3D scene geometry reasoning, which helps to achieve better estimation accuracy. We define nine basic geometric models for general indoor restricted-view scenes, then extract low- and medium-level colour and depth features from all four RGBD channels. A support vector machine (SVM) trained with Sequential Minimal Optimization is used as an efficient classifier. Experiments compare the results of this approach with previous work that does not take the depth channel as input.
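The feature extraction step can be illustrated with a minimal sketch. The abstract does not specify which features are used, so everything here is an assumption made for illustration: the function names `channel_stats` and `rgbd_features`, the 2x2 grid, the choice of per-cell mean and standard deviation as the "low-level" features, and the toy pixel values. The only point it is meant to show is that the depth channel is treated as a fourth channel alongside R, G and B when building the feature vector that a classifier such as an SVM would then consume.

```python
from statistics import mean, pstdev


def channel_stats(pixels, channel):
    """Mean and population standard deviation of one channel over a pixel list."""
    values = [p[channel] for p in pixels]
    return [mean(values), pstdev(values)]


def rgbd_features(image, grid=2):
    """Concatenate per-cell (mean, std) for each of the R, G, B and D channels.

    image: list of rows, each row a list of (r, g, b, d) tuples.
    grid:  the image is split into grid x grid cells (a hypothetical choice,
           not taken from the paper).
    """
    height, width = len(image), len(image[0])
    features = []
    for gy in range(grid):
        for gx in range(grid):
            # Integer cell boundaries; cells cover the image without overlap.
            y0, y1 = gy * height // grid, (gy + 1) * height // grid
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            for channel in range(4):  # 0=R, 1=G, 2=B, 3=depth
                features.extend(channel_stats(cell, channel))
    return features


# Toy 4x4 RGBD image with arbitrary values (not real sensor data).
image = [[(10 * x, 10 * y, 128, 1000 + 100 * y) for x in range(4)]
         for y in range(4)]
vector = rgbd_features(image)
print(len(vector))  # 2x2 cells * 4 channels * 2 statistics = 32
```

A colour-only baseline, comparable in spirit to the previous work mentioned in the abstract, would iterate over `range(3)` instead of `range(4)`, so the two feature vectors differ only in the depth statistics.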