Yanli Ji, Y. Ko, Atsushi Shimada, H. Nagahara, R. Taniguchi
DOI: 10.1145/2390776.2390785 (https://doi.org/10.1145/2390776.2390785)
Published in: CEA'13 : proceedings of the 5th International Workshop on Multimedia for Cooking & Eating Activities : October 21, 2013, Barcelona, Spain, pp. 37-42
Publication date: 2012-11-02
Cooking gesture recognition using local feature and depth image
In this paper, we propose a method that combines visual local features with depth image information to recognize cooking gestures. We adopt the feature calculation method of [2], which uses an extended FAST detector and the compact CHOG3D descriptor, to compute visual local features. The local features are packed into a bag-of-words (BoW) representation over frame sequences to represent the cooking gestures. In addition, depth images of the hand gestures are extracted and integrated spatio-temporally to represent the position and trajectory of the gestures. Both kinds of features are used to describe cooking gestures, and recognition is performed with an SVM. Our method determines the gesture class for each frame in a cooking sequence. By analyzing the per-frame results, we recognize cooking gestures in continuous frame sequences of cooking menus and find the temporal positions of the recognized gestures.
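Two steps of the pipeline can be illustrated with a minimal sketch: quantising local descriptors into a BoW histogram, and turning per-frame class labels into gestures with temporal positions. The abstract does not say how the frame results are analyzed, so the sliding-window majority vote below is an assumption, not the authors' procedure. The function names, the toy codebook, and the labels "cut" and "mix" are hypothetical, and the per-frame labels are assumed to come from an already trained classifier such as the SVM.

```python
from collections import Counter
from itertools import groupby


def bow_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest codeword and return
    an L1-normalised histogram of codeword counts."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(d, c)) for c in codebook]
        counts[dists.index(min(dists))] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


def smooth_labels(labels, radius=2):
    """Replace each frame label by the majority label inside a sliding
    window of up to 2 * radius + 1 frames (assumed smoothing step)."""
    smoothed = []
    for i in range(len(labels)):
        window = labels[max(0, i - radius): i + radius + 1]
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed


def segments(labels):
    """Collapse runs of identical labels into (label, start, end) tuples,
    where start and end are inclusive frame indices."""
    out = []
    start = 0
    for label, run in groupby(labels):
        n = len(list(run))
        out.append((label, start, start + n - 1))
        start += n
    return out


# Toy data: a 2-word codebook and three 2-D descriptors from one frame window.
codebook = [[0.0, 0.0], [1.0, 1.0]]
hist = bow_histogram([[0.0, 0.0], [1.0, 1.0], [0.9, 1.1]], codebook)

# Hypothetical per-frame classifier output with one isolated misclassification.
frame_labels = ["cut", "cut", "mix", "cut", "cut", "mix", "mix", "mix", "mix"]
gesture_segments = segments(smooth_labels(frame_labels, radius=1))
```

Here `gesture_segments` lists each recognized gesture with its first and last frame index, which is one simple way to report temporal positions. In a fuller pipeline the BoW histogram would be concatenated with the depth-based feature before classification.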