VSLink: A Fast and Pervasive Approach to Physical Cyber Space Interaction via Visual SLAM

Han Zhou, Jiaming Huang, Hongchang Fan, Geng Ren, Yi Gao, Wei Dong
DOI: 10.1109/MSN57253.2022.00054
Published in: 2022 18th International Conference on Mobility, Sensing and Networking (MSN), December 2022
Citations: 0

Abstract

With the rapid growth of the Internet of Things, people are now surrounded by many devices. To enable efficient interaction with these devices, human-device interaction technologies are evolving. Because existing methods (e.g., mobile apps) require users to remember the mapping between a real-world device and its digital counterpart, an important goal is to close this gap. In this paper, we propose VSLink, which offers human-device interaction in an Augmented-Reality-like manner. VSLink achieves fast object identification and pervasive interaction, fusing physical space and cyberspace. To improve processing speed and accuracy, VSLink adopts a two-step object identification method to locate interaction targets: visual SLAM and object detection neural networks detect stable and movable objects separately, and a detection prior from SLAM is sent to the neural networks, enabling sparse-convolution-based inference acceleration. VSLink also offers a platform on which users can customize the interaction target, function, and interface. We evaluated VSLink in an environment containing multiple objects to interact with. The results show that it achieves a 33% network inference acceleration on state-of-the-art networks and enables object identification with 30 FPS video input.
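The abstract does not include an implementation, but the core acceleration idea — using a SLAM-derived prior to skip convolution over regions already identified as static — can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration: the function names, the boolean `prior` mask, and the gating rule (compute an output position only if its receptive field overlaps the prior) are not taken from the paper.

```python
import numpy as np


def conv2d_dense(x, k):
    """Naive valid 2-D convolution over the entire frame."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out


def conv2d_prior_gated(x, k, prior):
    """Sparse variant: compute an output position only when its
    receptive field overlaps the prior mask (hypothetical regions
    where movable objects may appear); static regions, assumed to
    be handled by visual SLAM, are skipped and left at zero."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            if prior[i:i + kh, j:j + kw].any():
                out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out


rng = np.random.default_rng(0)
frame = rng.standard_normal((16, 16))
kernel = rng.standard_normal((3, 3))

# Hypothetical SLAM prior: movable objects expected only in this region.
prior = np.zeros((16, 16), dtype=bool)
prior[4:10, 4:10] = True

dense = conv2d_dense(frame, kernel)
sparse = conv2d_prior_gated(frame, kernel, prior)
```

Inside the prior region the gated result matches the dense convolution exactly, while positions whose receptive fields never touch the prior are skipped entirely — the source of the reported inference savings, with the exact speedup depending on how much of the frame the prior covers.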