Deep Learning Based Machine Vision: First Steps Towards a Hand Gesture Recognition Set Up for Collaborative Robots

Cristina Nuzzi, S. Pasinetti, M. Lancini, F. Docchio, G. Sansoni
DOI: 10.1109/METROI4.2018.8439044
Published in: 2018 Workshop on Metrology for Industry 4.0 and IoT
Publication date: 2018-04-16
Citations: 17

Abstract

In this paper, we present a smart hand gesture recognition experimental setup for collaborative robots, using a Faster R-CNN object detector to accurately locate the hands in RGB images acquired by a Kinect v2 camera. We coded the detector in MATLAB, together with a purposely designed prediction function needed to detect static gestures as we have defined them. We performed a number of experiments with different datasets to evaluate the performance of the model in different situations: a basic hand gesture dataset with four gestures performed by combining both hands; a dataset where the actors wear skin-colored clothes while performing the gestures; a dataset where the actors wear light-blue gloves; and a dataset similar to the first one but with the camera placed close to the operator. The same tests were also conducted with the operator's face detected by the algorithm, in an attempt to improve prediction accuracy. Our experiments show that the best model accuracy and F1-Score are achieved by the complete model without face detection. We tested the model in real time, achieving performance compatible with real-time human-robot interaction, with an inference time of around 0.2 seconds.
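The prediction phase described above — combining per-hand detections from the object detector into one static two-hand gesture — can be sketched as follows. This is a minimal illustration, not the authors' MATLAB implementation: the hand-pose class names, the four-gesture vocabulary, and the combination rule are all hypothetical.

```python
# Hedged sketch of a static-gesture prediction step: a detector is
# assumed to return per-hand (label, score, bbox) tuples, and this
# function maps the two most confident hands to a gesture name.
# Class names and the gesture vocabulary are illustrative assumptions.

def predict_gesture(detections, score_threshold=0.5):
    """Combine per-hand detections into a single static gesture.

    detections: list of (label, score, bbox) tuples, with bbox given
    as (x, y, w, h) in image coordinates.
    Returns the gesture name, or None if fewer than two confident
    hand detections are available.
    """
    # Keep only confident hand detections.
    hands = [d for d in detections if d[1] >= score_threshold]
    if len(hands) < 2:
        return None
    # Take the two highest-scoring detections as the operator's hands.
    hands.sort(key=lambda d: d[1], reverse=True)
    (label_a, _, bbox_a), (label_b, _, bbox_b) = hands[:2]
    # Order the pair by horizontal position so the mapping does not
    # depend on detection order.
    if bbox_a[0] > bbox_b[0]:
        label_a, label_b = label_b, label_a
    # Map the ordered pair of hand poses to one of four gestures
    # (hypothetical vocabulary).
    gesture_map = {
        ("open", "open"): "start",
        ("fist", "fist"): "stop",
        ("open", "fist"): "slower",
        ("fist", "open"): "faster",
    }
    return gesture_map.get((label_a, label_b))
```

In a real-time loop, this function would run once per frame on the detector's output, so its cost is negligible next to the roughly 0.2-second inference time reported for the detector itself.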