{"title":"遥测记录半监督分割评估虚拟环境的手势和动作发现","authors":"A. Batch, Kyungjun Lee, H. Maddali, N. Elmqvist","doi":"10.1109/AIVR.2018.00009","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a novel pipeline for semi-supervised behavioral coding of videos of users testing a device or interface, with an eye toward human-computer interaction evaluation for virtual reality. Our system applies existing statistical techniques for time-series classification, including e-divisive change point detection and \"Symbolic Aggregate approXimation\" (SAX) with agglomerative hierarchical clustering, to 3D pose telemetry data. These techniques create classes of short segments of single-person video data–short actions of potential interest called \"micro-gestures.\" A long short-term memory (LSTM) layer then learns these micro-gestures from pose features generated purely from video via a pre-trained OpenPose convolutional neural network (CNN) to predict their occurrence in unlabeled test videos. We present and discuss the results from testing our system on the single user pose videos of the CMU Panoptic Dataset.","PeriodicalId":371868,"journal":{"name":"2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Gesture and Action Discovery for Evaluating Virtual Environments with Semi-Supervised Segmentation of Telemetry Records\",\"authors\":\"A. Batch, Kyungjun Lee, H. Maddali, N. Elmqvist\",\"doi\":\"10.1109/AIVR.2018.00009\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose a novel pipeline for semi-supervised behavioral coding of videos of users testing a device or interface, with an eye toward human-computer interaction evaluation for virtual reality. Our system applies existing statistical techniques for time-series classification, including e-divisive change point detection and \\\"Symbolic Aggregate approXimation\\\" (SAX) with agglomerative hierarchical clustering, to 3D pose telemetry data. These techniques create classes of short segments of single-person video data–short actions of potential interest called \\\"micro-gestures.\\\" A long short-term memory (LSTM) layer then learns these micro-gestures from pose features generated purely from video via a pre-trained OpenPose convolutional neural network (CNN) to predict their occurrence in unlabeled test videos. 
We present and discuss the results from testing our system on the single user pose videos of the CMU Panoptic Dataset.\",\"PeriodicalId\":371868,\"journal\":{\"name\":\"2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/AIVR.2018.00009\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AIVR.2018.00009","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Gesture and Action Discovery for Evaluating Virtual Environments with Semi-Supervised Segmentation of Telemetry Records
In this paper, we propose a novel pipeline for semi-supervised behavioral coding of videos of users testing a device or interface, with an eye toward human-computer interaction evaluation for virtual reality. Our system applies existing statistical techniques for time-series classification, including e-divisive change point detection and "Symbolic Aggregate approXimation" (SAX) with agglomerative hierarchical clustering, to 3D pose telemetry data. These techniques create classes of short segments of single-person video data: short actions of potential interest called "micro-gestures." A long short-term memory (LSTM) layer then learns these micro-gestures from pose features generated purely from video via a pre-trained OpenPose convolutional neural network (CNN) to predict their occurrence in unlabeled test videos. We present and discuss the results from testing our system on the single-user pose videos of the CMU Panoptic Dataset.
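The abstract describes the segmentation stage only at a high level. The sketch below illustrates the general idea of SAX discretization followed by agglomerative hierarchical clustering on a single telemetry channel; it is not the authors' implementation. The helper names (sax_word, cluster_segments), the fixed window length, the alphabet size, and the Ward linkage are all assumptions for illustration, and in the actual pipeline the segments would come from e-divisive change point detection rather than fixed windows.

```python
# Minimal sketch (assumed parameters, not the paper's code): SAX discretization of one
# pose-telemetry channel, then agglomerative hierarchical clustering of the symbolic words.
import numpy as np
from scipy.stats import norm
from scipy.cluster.hierarchy import linkage, fcluster

def sax_word(window, n_segments=8, alphabet_size=5):
    """Z-normalize a window, reduce it with PAA, and map each segment to a symbol."""
    w = (window - window.mean()) / (window.std() + 1e-8)
    # Piecewise Aggregate Approximation: mean of each equal-length segment
    # (assumes the window length is divisible by n_segments).
    paa = w.reshape(n_segments, -1).mean(axis=1)
    # Breakpoints that split a standard normal into equiprobable regions.
    breakpoints = norm.ppf(np.linspace(0, 1, alphabet_size + 1)[1:-1])
    return np.searchsorted(breakpoints, paa)  # integer symbols in 0..alphabet_size-1

def cluster_segments(telemetry, window_len=64, n_clusters=10):
    """Slide a fixed window over a 1-D telemetry channel and cluster the SAX words."""
    words = np.array([
        sax_word(telemetry[i:i + window_len])
        for i in range(0, len(telemetry) - window_len, window_len)
    ])
    # Agglomerative (hierarchical) clustering; Ward linkage is an assumed choice.
    Z = linkage(words, method="ward")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

# Toy usage on synthetic telemetry; each cluster label is a candidate "micro-gesture" class.
labels = cluster_segments(np.sin(np.linspace(0, 50, 4096)) + 0.1 * np.random.randn(4096))
```

In the full system these discovered classes would then serve as training labels for the LSTM layer, which predicts their occurrence from OpenPose-derived pose features in unlabeled test videos.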