Identification of the Driver's Interest Point using a Head Pose Trajectory for Situated Dialog Systems
Young-Ho Kim, Teruhisa Misu
Proceedings of the 16th International Conference on Multimodal Interaction, published 2014-11-12
DOI: 10.1145/2663204.2663230 (https://doi.org/10.1145/2663204.2663230)
Citations: 8
Abstract
This paper addresses situated language understanding in a moving car. In particular, we propose a method for understanding user queries about specific target buildings in the driver's surroundings, based on the driver's head pose and speech information. To pick out the head pose motion that is related to the user query from among the spontaneous motions that occur while driving, we construct a model that describes the relationship between sequences of a driver's head pose and the relative direction to an interest point, using Gaussian process regression. We also account for time-varying interest points using kernel density estimation. We collected situated queries from subject drivers using our research system embedded in a real car. The proposed method improves the target identification rate by 14% in the user-independent training condition and by 27% in the user-dependent training condition over the method that uses the head motion at the start-of-speech timing.
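
To make the kernel density estimation step more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes that some upstream model (the Gaussian process regression in the paper) has already produced a handful of per-frame estimates of the relative direction to the interest point, in degrees. The function names, the bandwidth, the candidate buildings, and all numbers are hypothetical and chosen only for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates a 1-D density from samples.

    Uses a Gaussian kernel with a fixed bandwidth. Angle wrap-around
    at +/-180 degrees is ignored to keep the sketch short.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def pick_target(direction_estimates, candidate_bearings, bandwidth=5.0):
    """Pick the candidate whose bearing has the highest estimated density."""
    density = gaussian_kde(direction_estimates, bandwidth)
    return max(candidate_bearings, key=lambda name: density(candidate_bearings[name]))


# Hypothetical per-frame direction estimates (degrees, relative to heading).
# One outlier (-5.0) stands in for a spontaneous head motion.
estimates = [28.0, 31.5, 30.2, 29.1, -5.0, 33.0]

# Hypothetical candidate buildings and their relative bearings (degrees).
candidates = {"cafe": 30.0, "bank": -10.0, "hotel": 75.0}

best = pick_target(estimates, candidates)
print(best)
```

With these made-up inputs, most estimates cluster near 30 degrees, so the density is highest at the bearing of the hypothetical "cafe" candidate and the single outlier has little influence. A real system would also need to handle the car's motion, which changes every candidate's bearing over time.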