RFID and camera fusion for recognition of human-object interactions

Proceedings of the 27th Annual International Conference on Mobile Computing and Networking, Pub Date: 2021-10-19, DOI: 10.1145/3447993.3483244

Xiulong Liu, Dongdong Liu, Jiuwu Zhang, Tao Gu, Keqiu Li

Citations: 13

Abstract

Recognizing human-object interactions is of practical importance in human-centric sensing scenarios such as smart supermarkets, factories, and homes. This paper proposes RF-Camera, a system that fuses RFID and computer vision (CV) techniques and is the first work to recognize human gestural interactions with physical objects in multi-subject, multi-object scenarios. In RF-Camera, we first propose a dimension reduction method that transforms the subject's 3D hand trajectory, captured by a depth camera, into a 2D image from which the subject's gesture can be recognized. We also propose a method to extract the facial image of the target subject from an image that may contain irrelevant subjects, and then use it to recognize the subject's identity. Finally, we model the physical movement of the held object's tag and predict its phase data; comparing the predicted phase data with the real phase data of each tag reveals the human-object matching. Implementing RF-Camera raises three technical challenges. (i) To remove noisy data corresponding to irrelevant actions from the raw sensing data, we propose a state transition diagram that determines the boundaries of the effective data. (ii) To predict the phase data of the held target tag when the hand-tag offset is unknown, we quantify the target tag's trajectory by adding a variable hand-tag vector to the captured hand trajectory. (iii) To ensure high reading rates for target tags in tag-dense scenarios, we propose a CV-assisted RFID scheduling method in which analytics on CV data help schedule RFID readings. We conduct extensive experiments to evaluate RF-Camera. The results show that it recognizes gestural actions, human identity, and human-object matching with an average accuracy higher than 90% in most cases.
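The phase-based matching step in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the textbook backscatter phase model (phase proportional to the round-trip distance 2d, wrapped to 2π), a single antenna at a known position, a carrier wavelength of about 0.326 m, and a brute-force search over a few candidate hand-tag vectors. All function names and parameters below are hypothetical.

```python
import math

TWO_PI = 2 * math.pi
WAVELENGTH_M = 0.326  # assumption: roughly a 920 MHz UHF carrier (c / f)


def predict_phase(tag_pos, antenna_pos, wavelength=WAVELENGTH_M):
    """Backscatter phase for a round trip of 2*d, wrapped to [0, 2*pi)."""
    d = math.dist(tag_pos, antenna_pos)
    return (4 * math.pi * d / wavelength) % TWO_PI


def circular_diff(a, b):
    """Smallest absolute angular distance between two phases."""
    d = (a - b) % TWO_PI
    return min(d, TWO_PI - d)


def relative(phases):
    """Phase change since the first sample; cancels a constant hardware offset."""
    return [(p - phases[0]) % TWO_PI for p in phases]


def predicted_phases(hand_traj, hand_tag_vec, antenna_pos):
    """Shift every hand position by the hand-tag vector, then predict phase."""
    return [
        predict_phase(tuple(h + v for h, v in zip(pos, hand_tag_vec)), antenna_pos)
        for pos in hand_traj
    ]


def mismatch(hand_traj, measured, antenna_pos, candidate_vecs):
    """Lowest mean phase error over all candidate hand-tag vectors."""
    meas_rel = relative(measured)
    best = math.inf
    for vec in candidate_vecs:
        pred_rel = relative(predicted_phases(hand_traj, vec, antenna_pos))
        err = sum(circular_diff(p, m) for p, m in zip(pred_rel, meas_rel))
        best = min(best, err / len(meas_rel))
    return best


def match_tag(hand_traj, measured_by_tag, antenna_pos, candidate_vecs):
    """Return the tag ID whose measured phases best fit the hand trajectory."""
    return min(
        measured_by_tag,
        key=lambda tag: mismatch(
            hand_traj, measured_by_tag[tag], antenna_pos, candidate_vecs
        ),
    )
```

Calling `match_tag` with one hand trajectory and a dictionary of per-tag phase sequences (sampled at the same instants as the trajectory) returns the tag that most plausibly moved with the hand. Comparing phase changes relative to the first sample, rather than absolute phases, is a design choice of this sketch to sidestep the unknown constant offset introduced by reader and tag hardware.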
