Alvaro Marcos-Ramiro, Daniel Pizarro-Perez, Marta Marrón Romera, D. Gática-Pérez
Proceedings of the 16th International Conference on Multimodal Interaction. Published 2014-11-12. DOI: 10.1145/2663204.2663267
Capturing Upper Body Motion in Conversation: An Appearance Quasi-Invariant Approach
We address the problem of retrieving and measuring body communication cues in seated conversations by means of markerless motion capture. In psychological studies, the use of automatic methods is key to reducing the subjectivity present in the manual behavioral coding used to extract these cues. Such studies usually involve hundreds of subjects with different clothing, non-acted poses, and varying distances to the camera, recorded as uncalibrated, RGB-only video. Range cameras, however, are not yet common in psychology research, and are absent from existing recordings. It is therefore highly relevant to develop a fast method that works under these conditions. Given the known relationship between depth and motion estimates, we propose to robustly integrate highly appearance-invariant image motion features in a machine learning approach, complemented with an effective tracking scheme. We evaluate the method's performance on existing databases and on a database of upper body poses displayed in job interviews, which we make public. In our scenario, its accuracy is comparable to that of Kinect without using a range camera, and it is state-of-the-art on the HumanEva and ChaLearn 2011 evaluation datasets.
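The abstract does not spell out the feature pipeline, but the core idea it names — motion features that depend on movement rather than on clothing appearance, combined with temporal tracking — can be illustrated with a minimal sketch. The descriptor below uses block-wise temporal differencing (an assumption made here for illustration; the paper's actual features are more sophisticated), which responds to motion energy and is largely invariant to subject appearance, and `smooth_track` stands in for the tracking scheme with simple exponential smoothing. All function names are hypothetical, not from the paper.

```python
import numpy as np

def motion_energy_descriptor(prev, curr, grid=(4, 4)):
    """Block-wise motion-energy descriptor from two grayscale frames.

    Temporal differencing depends on movement rather than pixel
    appearance, giving a crude appearance quasi-invariant feature.
    Returns an L2-normalized vector of per-block mean motion energy.
    """
    diff = np.abs(curr.astype(np.float32) - prev.astype(np.float32))
    h, w = diff.shape
    gh, gw = grid
    desc = np.zeros(gh * gw, dtype=np.float32)
    for i in range(gh):
        for j in range(gw):
            block = diff[i * h // gh:(i + 1) * h // gh,
                         j * w // gw:(j + 1) * w // gw]
            desc[i * gw + j] = block.mean()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

def smooth_track(estimates, alpha=0.6):
    """Exponentially smooth a sequence of per-frame 2D joint estimates.

    A stand-in for the tracking scheme: each output is a blend of the
    current frame's estimate and the previous smoothed position.
    """
    track = [np.asarray(estimates[0], dtype=np.float32)]
    for est in estimates[1:]:
        est = np.asarray(est, dtype=np.float32)
        track.append(alpha * est + (1 - alpha) * track[-1])
    return track
```

In a full system, descriptors like these would feed a learned regressor mapping motion features to upper-body pose, with the tracker stabilizing per-frame predictions.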