Han Zhou, Jiaming Huang, Hongchang Fan, Geng Ren, Yi Gao, Wei Dong
{"title":"通过视觉SLAM实现物理网络空间交互的快速和普遍方法","authors":"Han Zhou, Jiaming Huang, Hongchang Fan, Geng Ren, Yi Gao, Wei Dong","doi":"10.1109/MSN57253.2022.00054","DOIUrl":null,"url":null,"abstract":"With the fast growth of the Internet of Things, people now are surrounded by plenty of devices. To achieve efficient interaction with these devices, human-device interaction technologies are evolving. Because existing methods (mobile App) require users to remember the mapping between the real-world device and the digital one, an important point is to break such a gap. In this paper, we propose VSLink, which offers human-device interaction in an Augmented-Reality-like manner. VSLink achieves fast object identification and pervasive interaction for fusing the physical and cyberspace. To improve processing speed and accuracy, VSLink adopts a two-step object identification method to locate the interaction targets. In VSLink, visual SLAM and object detection neural networks detect stable/-movable objects separately, and detection prior from SLAM is sent to neural networks which enables sparse-convolution-based inference acceleration. VSLink offers a platform where the user could customize the interaction target, function, and interface. We evaluated VSLink in an environment containing multiple objects to interact with. 
The results showed that it achieves a 33% network inference acceleration on state-of-the-art networks, and enables object identification with 30FPS video input.","PeriodicalId":114459,"journal":{"name":"2022 18th International Conference on Mobility, Sensing and Networking (MSN)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"VSLink: A Fast and Pervasive Approach to Physical Cyber Space Interaction via Visual SLAM\",\"authors\":\"Han Zhou, Jiaming Huang, Hongchang Fan, Geng Ren, Yi Gao, Wei Dong\",\"doi\":\"10.1109/MSN57253.2022.00054\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the fast growth of the Internet of Things, people now are surrounded by plenty of devices. To achieve efficient interaction with these devices, human-device interaction technologies are evolving. Because existing methods (mobile App) require users to remember the mapping between the real-world device and the digital one, an important point is to break such a gap. In this paper, we propose VSLink, which offers human-device interaction in an Augmented-Reality-like manner. VSLink achieves fast object identification and pervasive interaction for fusing the physical and cyberspace. To improve processing speed and accuracy, VSLink adopts a two-step object identification method to locate the interaction targets. In VSLink, visual SLAM and object detection neural networks detect stable/-movable objects separately, and detection prior from SLAM is sent to neural networks which enables sparse-convolution-based inference acceleration. VSLink offers a platform where the user could customize the interaction target, function, and interface. We evaluated VSLink in an environment containing multiple objects to interact with. 
The results showed that it achieves a 33% network inference acceleration on state-of-the-art networks, and enables object identification with 30FPS video input.\",\"PeriodicalId\":114459,\"journal\":{\"name\":\"2022 18th International Conference on Mobility, Sensing and Networking (MSN)\",\"volume\":\"41 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 18th International Conference on Mobility, Sensing and Networking (MSN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MSN57253.2022.00054\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 18th International Conference on Mobility, Sensing and Networking (MSN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MSN57253.2022.00054","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
VSLink: A Fast and Pervasive Approach to Physical Cyber Space Interaction via Visual SLAM
With the rapid growth of the Internet of Things, people are now surrounded by many devices. To interact with these devices efficiently, human-device interaction technologies are evolving. Existing methods (e.g., mobile apps) require users to remember the mapping between a real-world device and its digital counterpart, so an important goal is to close this gap. In this paper, we propose VSLink, which offers human-device interaction in an augmented-reality-like manner. VSLink achieves fast object identification and pervasive interaction to fuse the physical space and cyberspace. To improve processing speed and accuracy, VSLink adopts a two-step object identification method to locate interaction targets: visual SLAM and object detection neural networks detect stable and movable objects separately, and the detection prior from SLAM is passed to the neural networks, enabling sparse-convolution-based inference acceleration. VSLink also provides a platform on which users can customize the interaction target, function, and interface. We evaluated VSLink in an environment containing multiple objects to interact with. The results show that it achieves a 33% inference acceleration on state-of-the-art networks and enables object identification on 30 FPS video input.
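The abstract's core acceleration idea — using a SLAM-derived prior to restrict neural-network inference to regions that may contain movable objects — can be illustrated with a minimal sketch. This is not the paper's implementation: the tiling scheme, `slam_prior_mask`, and `sparse_inference` are illustrative assumptions, with a counter standing in for the actual sparse-convolution detector.

```python
import numpy as np

def slam_prior_mask(frame_shape, stable_boxes):
    """Build a boolean mask where True marks regions the SLAM prior
    has NOT already resolved as stable objects (i.e., regions that
    still need neural-network inference)."""
    mask = np.ones(frame_shape[:2], dtype=bool)
    for (x0, y0, x1, y1) in stable_boxes:
        mask[y0:y1, x0:x1] = False  # stable region: skip inference here
    return mask

def sparse_inference(mask, tile=8):
    """Count how many tiles would be passed to the detector versus a
    dense pass over the whole frame. A real system would run sparse
    convolutions only on the active tiles."""
    h, w = mask.shape
    active = total = 0
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            total += 1
            if mask[y:y + tile, x:x + tile].any():
                active += 1  # tile contains potentially movable content
    return active, total

# Hypothetical 64x64 frame whose left half SLAM reports as stable.
mask = slam_prior_mask((64, 64, 3), [(0, 0, 32, 64)])
active, total = sparse_inference(mask)
print(f"tiles run: {active}/{total}")  # prints "tiles run: 32/64"
```

Here the prior halves the number of tiles the detector must process, which is the mechanism behind the reported inference speedup: computation scales with the active area rather than the full frame.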