Han Zhou, Jiaming Huang, Hongchang Fan, Geng Ren, Yi Gao, Wei Dong
{"title":"通过视觉SLAM实现物理网络空间交互的快速和普遍方法","authors":"Han Zhou, Jiaming Huang, Hongchang Fan, Geng Ren, Yi Gao, Wei Dong","doi":"10.1109/MSN57253.2022.00054","DOIUrl":null,"url":null,"abstract":"With the fast growth of the Internet of Things, people now are surrounded by plenty of devices. To achieve efficient interaction with these devices, human-device interaction technologies are evolving. Because existing methods (mobile App) require users to remember the mapping between the real-world device and the digital one, an important point is to break such a gap. In this paper, we propose VSLink, which offers human-device interaction in an Augmented-Reality-like manner. VSLink achieves fast object identification and pervasive interaction for fusing the physical and cyberspace. To improve processing speed and accuracy, VSLink adopts a two-step object identification method to locate the interaction targets. In VSLink, visual SLAM and object detection neural networks detect stable/-movable objects separately, and detection prior from SLAM is sent to neural networks which enables sparse-convolution-based inference acceleration. VSLink offers a platform where the user could customize the interaction target, function, and interface. We evaluated VSLink in an environment containing multiple objects to interact with. 
The results showed that it achieves a 33% network inference acceleration on state-of-the-art networks, and enables object identification with 30FPS video input.","PeriodicalId":114459,"journal":{"name":"2022 18th International Conference on Mobility, Sensing and Networking (MSN)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"VSLink: A Fast and Pervasive Approach to Physical Cyber Space Interaction via Visual SLAM\",\"authors\":\"Han Zhou, Jiaming Huang, Hongchang Fan, Geng Ren, Yi Gao, Wei Dong\",\"doi\":\"10.1109/MSN57253.2022.00054\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the fast growth of the Internet of Things, people now are surrounded by plenty of devices. To achieve efficient interaction with these devices, human-device interaction technologies are evolving. Because existing methods (mobile App) require users to remember the mapping between the real-world device and the digital one, an important point is to break such a gap. In this paper, we propose VSLink, which offers human-device interaction in an Augmented-Reality-like manner. VSLink achieves fast object identification and pervasive interaction for fusing the physical and cyberspace. To improve processing speed and accuracy, VSLink adopts a two-step object identification method to locate the interaction targets. In VSLink, visual SLAM and object detection neural networks detect stable/-movable objects separately, and detection prior from SLAM is sent to neural networks which enables sparse-convolution-based inference acceleration. VSLink offers a platform where the user could customize the interaction target, function, and interface. We evaluated VSLink in an environment containing multiple objects to interact with. 
The results showed that it achieves a 33% network inference acceleration on state-of-the-art networks, and enables object identification with 30FPS video input.\",\"PeriodicalId\":114459,\"journal\":{\"name\":\"2022 18th International Conference on Mobility, Sensing and Networking (MSN)\",\"volume\":\"41 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 18th International Conference on Mobility, Sensing and Networking (MSN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MSN57253.2022.00054\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 18th International Conference on Mobility, Sensing and Networking (MSN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MSN57253.2022.00054","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
VSLink: A Fast and Pervasive Approach to Physical Cyber Space Interaction via Visual SLAM
With the rapid growth of the Internet of Things, people are now surrounded by many devices. To interact with these devices efficiently, human-device interaction technologies are evolving. Existing methods (e.g., mobile apps) require users to remember the mapping between a real-world device and its digital counterpart, so an important goal is to close this gap. In this paper, we propose VSLink, which offers human-device interaction in an augmented-reality-like manner. VSLink achieves fast object identification and pervasive interaction to fuse the physical space and cyberspace. To improve processing speed and accuracy, VSLink adopts a two-step object identification method to locate interaction targets: visual SLAM and object detection neural networks detect stable and movable objects separately, and the detection prior from SLAM is passed to the neural networks, enabling sparse-convolution-based inference acceleration. VSLink also provides a platform on which users can customize the interaction target, function, and interface. We evaluated VSLink in an environment containing multiple objects to interact with. The results show that it achieves a 33% inference acceleration on state-of-the-art networks and enables object identification on 30 FPS video input.
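The abstract's core acceleration idea — using a SLAM-derived prior to restrict neural-network inference to regions that may contain movable objects — can be illustrated with a minimal sketch. This is not the paper's implementation: the tiling scheme, `slam_prior_mask`, and `sparse_inference` are illustrative assumptions, with a counter standing in for the actual sparse-convolution detector.

```python
import numpy as np

def slam_prior_mask(frame_shape, stable_boxes):
    """Build a boolean mask where True marks regions the SLAM prior
    has NOT already resolved as stable objects (i.e., regions that
    still need neural-network inference)."""
    mask = np.ones(frame_shape[:2], dtype=bool)
    for (x0, y0, x1, y1) in stable_boxes:
        mask[y0:y1, x0:x1] = False  # stable region: skip inference here
    return mask

def sparse_inference(mask, tile=8):
    """Count how many tiles would be passed to the detector versus a
    dense pass over the whole frame. A real system would run sparse
    convolutions only on the active tiles."""
    h, w = mask.shape
    active = total = 0
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            total += 1
            if mask[y:y + tile, x:x + tile].any():
                active += 1  # tile contains potentially movable content
    return active, total

# Hypothetical 64x64 frame whose left half SLAM reports as stable.
mask = slam_prior_mask((64, 64, 3), [(0, 0, 32, 64)])
active, total = sparse_inference(mask)
print(f"tiles run: {active}/{total}")  # prints "tiles run: 32/64"
```

Here the prior halves the number of tiles the detector must process, which is the mechanism behind the reported inference speedup: computation scales with the active area rather than the full frame.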