Accelerating vision-based 3D indoor localization by distributing image processing over space and time
D. Yun, Hyunseok Chang, T. V. Lakshman
Proceedings of the ACM Symposium on Virtual Reality Software and Technology, pp. 77-86, 2014-11-11
DOI: https://doi.org/10.1145/2671015.2671018
Citations: 2
Abstract
In a vision-based 3D indoor localization system, localizing the user's device at a high frame rate is important for supporting real-time augmented reality applications. However, vision-based 3D localization typically involves 2D keypoint detection and 2D-3D matching, which are in general too computationally intensive to be carried out at a high frame rate (e.g., 30 fps) on commodity hardware such as laptops or smartphones. To reduce the per-frame computation time for 3D localization, we present a new method that distributes the required computation over space and time: it splits the video frame region into multiple sub-blocks and processes only one sub-block per video frame, in a rotating sequence. The proposed method is general enough to be applied to any keypoint detection and 2D-3D matching scheme. We apply the method in a prototype 3D indoor localization system and evaluate its performance in a 120 m long indoor hallway environment using 5,200 video frames of 640x480 (VGA) resolution and a commodity laptop. When SIFT-based keypoint detection is used, our method reduces the average and maximum computation time per frame by factors of 10 and 7, respectively, with a marginal increase in positioning error (e.g., 0.17 m). This improvement enables the frame processing rate to increase from 3.2 fps to 23.3 fps.
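The rotating sub-block idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the grid shape, the rotation order, or how results from earlier sub-blocks are combined, so the 2x2 grid, the row-major round-robin order, and the names `split_into_sub_blocks` and `sub_block_for_frame` are all assumptions made for illustration, not the authors' implementation. Keypoint detection and 2D-3D matching are deliberately left out; the sketch only shows which region of each frame would be processed.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def split_into_sub_blocks(width: int, height: int, rows: int, cols: int) -> List[Rect]:
    """Tile a width x height frame into rows * cols rectangles, in row-major order.

    The grid shape is a hypothetical parameter; the abstract does not specify it.
    """
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    rects: List[Rect] = []
    for r in range(rows):
        # Integer boundaries chosen so the tiles cover the frame with no gaps.
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            rects.append((x0, y0, x1 - x0, y1 - y0))
    return rects


def sub_block_for_frame(frame_index: int, rects: List[Rect]) -> Rect:
    """Select the single sub-block to process for this frame (assumed round-robin)."""
    return rects[frame_index % len(rects)]


# VGA frame as in the evaluation; the 2x2 grid is an assumption.
rects = split_into_sub_blocks(640, 480, rows=2, cols=2)
schedule = [sub_block_for_frame(i, rects) for i in range(6)]
```

With this assumed layout, each frame touches a quarter of the pixels, and every region of the image is revisited once every four frames. A real system would run its keypoint detector only on the selected rectangle and reuse matches from recently processed sub-blocks when estimating the pose.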