Arm Gesture Recognition using a Convolutional Neural Network
Eirini Mathe, Alexandros Mitsou, E. Spyrou, Phivos Mylonas
2018 13th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP), published 2018-09-01
DOI: https://doi.org/10.1109/SMAP.2018.8501886
Citation count: 9
Abstract
In this paper we present an approach to arm gesture recognition that uses a Convolutional Neural Network (CNN) trained on Discrete Fourier Transform (DFT) images derived from raw sensor readings. More specifically, we use the Kinect RGB and depth camera to capture the 3D positions of a set of skeletal joints. For each joint we create one signal per 3D coordinate and concatenate those signals into an image, the DFT of which is used to describe the gesture. We evaluate our approach on a dataset of hand gestures involving either one or both hands simultaneously, and compare it to another approach that uses hand-crafted features.
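The signal-to-image step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not specify the input layout, the choice of a 2D DFT, the log-magnitude scaling, or the normalization, so all of these are assumptions, as is the hypothetical function name `gesture_to_dft_image`. Random data stands in for the Kinect joint positions.

```python
import numpy as np


def gesture_to_dft_image(joints: np.ndarray) -> np.ndarray:
    """Turn a joint trajectory into a normalized DFT magnitude image.

    joints: array of shape (n_frames, n_joints, 3) holding the x, y, z
            position of each tracked joint in each frame (assumed layout).
    Returns an array of shape (n_joints * 3, n_frames) with values in [0, 1].
    """
    n_frames, n_joints, n_coords = joints.shape

    # One row per (joint, coordinate) signal, one column per frame.
    signal_image = joints.reshape(n_frames, n_joints * n_coords).T

    # 2D DFT of the signal image, with the zero frequency moved to the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(signal_image))

    # Log-magnitude compresses the dynamic range (an assumed choice).
    magnitude = np.log1p(np.abs(spectrum))

    # Scale to [0, 1] so the result can be treated as a grayscale image.
    peak = magnitude.max()
    return magnitude / peak if peak > 0 else magnitude


# Synthetic stand-in for a recorded gesture: 64 frames, 8 joints.
rng = np.random.default_rng(0)
demo_joints = rng.normal(size=(64, 8, 3))
image = gesture_to_dft_image(demo_joints)
print(image.shape)
```

With this layout each gesture yields one single-channel image whose height depends on the number of tracked joints and whose width depends on the gesture duration, so recordings of different lengths would need resampling or padding before being fed to a CNN with a fixed input size.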