Title: Can't hide your disappointment: Using human pose and facial cues for intent prediction in a target game
Authors: Vidullan Surendran, Alan R. Wagner
DOI: 10.1109/ARSO51874.2021.9541546
Published in: 2021 IEEE International Conference on Advanced Robotics and Its Social Impacts (ARSO)
Publication date: 2021-07-08
Citations: 1
Abstract
Recognising intent in collaborative human-robot tasks can improve team performance and the human's perception of the robot. Tasks that involve dynamic physical motions increase the likelihood of mistakes committed by the human. When a mistake is made, the observed outcome can differ from the human's intent. We set up a throwing task consisting of 9 targets and propose a method that can predict human intent in the presence of mistakes. This method uses a vision-based pipeline to predict the outcome of the throw, determine whether the subject's emotional reaction to the outcome indicates incongruence between intent and outcome, and finally predict the intent of the human in the context of the throwing task. We show that using human pose improves outcome prediction accuracy to 28%, compared to prior work that uses a two-stream architecture and achieves 22%. The method also predicts intent-outcome congruence accurately in 75% of cases. Since the prediction of intent in the presence of mistakes is currently understudied, we compare against a random baseline of 11% and find that the end-to-end intent recognition pipeline achieves an accuracy of 23%.
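The abstract describes a three-stage pipeline: predict the throw's outcome, detect intent-outcome incongruence from the subject's reaction, and then infer intent. A minimal sketch of how the final inference stage could combine the first two signals is shown below; the function name, the stage outputs (an outcome probability vector and a binary congruence flag), and the fallback-to-second-choice rule are all assumptions for illustration, not the paper's actual method. Note the random baseline of 11% corresponds to guessing one of the 9 targets uniformly (1/9 ≈ 11%).

```python
NUM_TARGETS = 9  # the task uses 9 throwing targets

def predict_intent(outcome_probs, congruent):
    """Hypothetical final stage: combine the outcome classifier's
    probabilities with the congruence flag to infer intended target.

    outcome_probs: list of 9 probabilities from the pose-based
                   outcome predictor (assumed interface).
    congruent:     True if the facial-cue detector judged the
                   reaction consistent with the observed outcome.
    """
    # Rank targets by predicted outcome probability, most likely first.
    ranked = sorted(range(NUM_TARGETS),
                    key=lambda t: outcome_probs[t], reverse=True)
    if congruent:
        # Reaction suggests the throw landed where intended:
        # intent coincides with the most likely outcome.
        return ranked[0]
    # Incongruent reaction: the intended target likely differs from
    # the observed outcome, so fall back to the next-best candidate.
    return ranked[1]
```

This sketch only illustrates why congruence detection matters: without it, intent prediction would collapse to outcome prediction and mistakes would always be misread as intended.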