Accelerating vision-based 3D indoor localization by distributing image processing over space and time

Proceedings of the ACM Symposium on Virtual Reality Software and Technology, pp. 77-86. Pub Date: 2014-11-11. DOI: 10.1145/2671015.2671018

D. Yun, Hyunseok Chang, T. V. Lakshman


Citations: 2

Abstract

In a vision-based 3D indoor localization system, localizing the user's device at a high frame rate is important for supporting real-time augmented reality applications. However, vision-based 3D localization typically involves 2D keypoint detection and 2D-3D matching, which are in general too computationally intensive to run at a high frame rate (e.g., 30 fps) on commodity hardware such as laptops or smartphones. To reduce per-frame computation time for 3D localization, we present a new method that distributes the required computation over space and time by splitting the video frame region into multiple sub-blocks and processing only one sub-block, chosen in a rotating sequence, at each video frame. The proposed method is general enough to be applied to any keypoint detection and 2D-3D matching scheme. We apply the method in a prototype 3D indoor localization system and evaluate its performance in a 120 m long indoor hallway environment using 5,200 video frames of 640x480 (VGA) resolution and a commodity laptop. When SIFT-based keypoint detection is used, our method reduces the average and maximum computation time per frame by factors of 10 and 7, respectively, with a marginal increase in positioning error (e.g., 0.17 m). This improvement raises the frame processing rate from 3.2 fps to 23.3 fps.
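The rotating sub-block idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 4x2 grid, the row-major visiting order, and the function names `split_into_blocks` and `block_for_frame` are assumptions made for illustration, since the abstract does not state how many sub-blocks are used or in what order they are visited. The sketch only shows the scheduling step; keypoint detection and 2D-3D matching are left out.

```python
from typing import List, Tuple

# (x, y, width, height) of a sub-block in pixel coordinates
Rect = Tuple[int, int, int, int]


def split_into_blocks(width: int, height: int, cols: int, rows: int) -> List[Rect]:
    """Partition a width x height frame into rows * cols rectangular sub-blocks.

    Integer division on the boundaries guarantees the blocks tile the frame
    exactly, even when the size is not divisible by the grid dimensions.
    """
    blocks: List[Rect] = []
    for r in range(rows):
        y0 = r * height // rows
        y1 = (r + 1) * height // rows
        for c in range(cols):
            x0 = c * width // cols
            x1 = (c + 1) * width // cols
            blocks.append((x0, y0, x1 - x0, y1 - y0))
    return blocks


def block_for_frame(frame_index: int, blocks: List[Rect]) -> Rect:
    """Pick the single sub-block to process at this frame (round-robin)."""
    return blocks[frame_index % len(blocks)]


# Hypothetical configuration: VGA frames (as in the abstract) with an
# assumed 4x2 grid, so each frame processes one eighth of the image area.
blocks = split_into_blocks(640, 480, cols=4, rows=2)
schedule = [block_for_frame(i, blocks) for i in range(10)]

for i, (x, y, w, h) in enumerate(schedule):
    print(f"frame {i}: process region x={x}, y={y}, w={w}, h={h}")
```

In a real pipeline, a keypoint detector would be run only on the selected region, and the detected keypoint coordinates would be offset by the block's `(x, y)` origin to map them back into full-frame coordinates. How results from sub-blocks processed at earlier frames are combined for matching is not specified in the abstract and is not modeled here.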
