VSLink: A Fast and Pervasive Approach to Physical Cyber Space Interaction via Visual SLAM

Han Zhou, Jiaming Huang, Hongchang Fan, Geng Ren, Yi Gao, Wei Dong
DOI: 10.1109/MSN57253.2022.00054
Published in: 2022 18th International Conference on Mobility, Sensing and Networking (MSN), December 2022
Citations: 0

Abstract

With the rapid growth of the Internet of Things, people are now surrounded by many devices. To enable efficient interaction with these devices, human-device interaction technologies are evolving. Because existing methods (e.g., mobile apps) require users to remember the mapping between a real-world device and its digital counterpart, an important goal is to close this gap. In this paper, we propose VSLink, which offers human-device interaction in an Augmented-Reality-like manner. VSLink achieves fast object identification and pervasive interaction, fusing physical space and cyberspace. To improve processing speed and accuracy, VSLink adopts a two-step object identification method to locate interaction targets: visual SLAM and object detection neural networks detect stable and movable objects separately, and a detection prior from SLAM is sent to the neural networks, enabling sparse-convolution-based inference acceleration. VSLink also offers a platform on which users can customize the interaction target, function, and interface. We evaluated VSLink in an environment containing multiple objects to interact with. The results show that it achieves a 33% network inference acceleration on state-of-the-art networks and enables object identification with 30 FPS video input.
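The abstract does not include an implementation, but the core acceleration idea — using a SLAM-derived prior to skip convolution over regions already identified as static — can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration: the function names, the boolean `prior` mask, and the gating rule (compute an output position only if its receptive field overlaps the prior) are not taken from the paper.

```python
import numpy as np


def conv2d_dense(x, k):
    """Naive valid 2-D convolution over the entire frame."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out


def conv2d_prior_gated(x, k, prior):
    """Sparse variant: compute an output position only when its
    receptive field overlaps the prior mask (hypothetical regions
    where movable objects may appear); static regions, assumed to
    be handled by visual SLAM, are skipped and left at zero."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            if prior[i:i + kh, j:j + kw].any():
                out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out


rng = np.random.default_rng(0)
frame = rng.standard_normal((16, 16))
kernel = rng.standard_normal((3, 3))

# Hypothetical SLAM prior: movable objects expected only in this region.
prior = np.zeros((16, 16), dtype=bool)
prior[4:10, 4:10] = True

dense = conv2d_dense(frame, kernel)
sparse = conv2d_prior_gated(frame, kernel, prior)
```

Inside the prior region the gated result matches the dense convolution exactly, while positions whose receptive fields never touch the prior are skipped entirely — the source of the reported inference savings, with the exact speedup depending on how much of the frame the prior covers.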