Proceedings of the 16th International Conference on Multimodal Interaction: Latest Articles

Identification of the Driver's Interest Point using a Head Pose Trajectory for Situated Dialog Systems
Pub Date: 2014-11-12 | DOI: 10.1145/2663204.2663230
Young-Ho Kim, Teruhisa Misu
Abstract: This paper addresses issues in situated language understanding in a moving car. In particular, we propose a method for understanding user queries about specific target buildings in the surroundings, based on the driver's head pose and speech information. To identify the meaningful head pose motion related to the user query among the spontaneous motions that occur while driving, we construct a model describing the relationship between sequences of the driver's head pose and the relative direction to an interest point using Gaussian process regression. We also account for time-varying interest points using kernel density estimation. We collected situated queries from subject drivers using our research system embedded in a real car. The proposed method improves the target identification rate by 14% in the user-independent training condition and by 27% in the user-dependent training condition over a method that uses only the head motion at the start-of-speech timing.
Citations: 8
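A minimal sketch of the kind of pipeline the abstract above describes: a Gaussian process regressor maps windows of head-yaw samples to the relative bearing of the queried target, and a kernel density estimate over the predicted bearings scores candidate points of interest. The window length, yaw-only features, bearing encoding, and synthetic training data are illustrative assumptions, not the authors' setup.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Assumed training data: 10-sample windows of head yaw (degrees) paired with
# the relative bearing (degrees) of the building the driver actually meant.
rng = np.random.default_rng(0)
X_train = rng.uniform(-60, 60, size=(200, 10))
y_train = X_train.mean(axis=1) + rng.normal(0, 3, 200)   # toy bearing ~ mean yaw

gpr = GaussianProcessRegressor(kernel=RBF(10.0) + WhiteKernel(1.0),
                               normalize_y=True).fit(X_train, y_train)

def score_candidates(yaw_trajectory, candidate_bearings):
    """Predict a bearing for every sliding window of the query-time trajectory,
    smooth the predictions with a KDE, and score each candidate point of
    interest by the density at its relative bearing."""
    windows = np.lib.stride_tricks.sliding_window_view(yaw_trajectory, 10)
    predicted = gpr.predict(windows)
    density = gaussian_kde(predicted)
    return {b: float(density(b)[0]) for b in candidate_bearings}

print(score_candidates(rng.uniform(20, 40, 60), [0.0, 30.0, -45.0]))
```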
Neural Networks for Emotion Recognition in the Wild
Pub Date: 2014-11-12 | DOI: 10.1145/2663204.2666270
Michal Grosicki
Abstract: In this paper we present a neural-network-based method for emotion recognition. The proposed model was developed as part of the 2014 Emotion Recognition in the Wild Challenge. It is composed of modality-specific neural networks, which were trained separately on audio and video data extracted from short video clips taken from various movies. Each network was trained on frame-level data, which were later aggregated by simple averaging of the predicted class distributions for each clip. In the next stage, various techniques for combining modalities were investigated, the best being a support vector machine with an RBF kernel. Our method achieved an accuracy of 37.84%, which is better than the 33.7% obtained by the best baseline model provided by the organisers.
Citations: 2
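A minimal sketch of the late-fusion step described above, under the assumption that per-frame class posteriors from separately trained audio and video networks are already available: the posteriors are averaged per clip, concatenated across modalities, and fed to an RBF-kernel SVM. The seven-class setup and the random stand-in posteriors are illustrative only.

```python
import numpy as np
from sklearn.svm import SVC

N_CLASSES = 7   # emotion categories, as in the EmotiW challenge data

def clip_level_features(frame_probs_audio, frame_probs_video):
    # Average the predicted class distributions over all frames of a clip,
    # then concatenate the two modality-level distributions.
    return np.concatenate([frame_probs_audio.mean(axis=0),
                           frame_probs_video.mean(axis=0)])

# Stand-in data: per-clip frame posteriors for both modalities plus a label.
rng = np.random.default_rng(0)
clips = [(rng.dirichlet(np.ones(N_CLASSES), size=40),   # 40 video frames
          rng.dirichlet(np.ones(N_CLASSES), size=25),   # 25 audio frames
          rng.integers(N_CLASSES)) for _ in range(100)]

X = np.stack([clip_level_features(a, v) for a, v, _ in clips])
y = np.array([label for _, _, label in clips])

fusion = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
print("training accuracy:", fusion.score(X, y))
```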
MLA'14: Third Multimodal Learning Analytics Workshop and Grand Challenges
Pub Date: 2014-11-12 | DOI: 10.1145/2663204.2668318
X. Ochoa, M. Worsley, K. Chiluiza, S. Luz
Abstract: This paper summarizes the third Multimodal Learning Analytics Workshop and Grand Challenges (MLA'14). This subfield of Learning Analytics focuses on the interpretation of the multimodal interactions that occur in learning environments, both digital and physical. It is a hybrid event that includes presentations on methods and techniques for analyzing and merging the different signals captured from these environments (workshop session), as well as more concrete results from applying Multimodal Learning Analytics techniques to predict the performance of students while solving math problems or presenting in the classroom (challenge sessions). A total of eight articles will be presented at this event. The main conclusion is that Multimodal Learning Analytics is a worthwhile research endeavour whose results can already be applied to improve the learning process.
Citations: 18
CrossMotion: Fusing Device and Image Motion for User Identification, Tracking and Device Association
Pub Date: 2014-11-12 | DOI: 10.1145/2663204.2663270
Andrew D. Wilson, Hrvoje Benko
Abstract: Identifying and tracking people and mobile devices indoors has many applications, but is still a challenging problem. We introduce a cross-modal sensor fusion approach to track mobile devices and the users carrying them. The CrossMotion technique matches the acceleration of a mobile device, as measured by an onboard inertial measurement unit, to similar acceleration observed in the infrared and depth images of a Microsoft Kinect v2 camera. This matching process is conceptually simple and avoids many of the difficulties typical of more common appearance-based approaches. In particular, CrossMotion requires neither a model of the appearance of the user or the device, nor, in many cases, a direct line of sight to the device. We demonstrate a real-time implementation that can be applied to many ubiquitous computing scenarios. In our experiments, CrossMotion found the person's body 99% of the time, on average within 7 cm of a reference device position.
Citations: 32
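A minimal sketch of the matching idea: compare the acceleration magnitude reported by a device's IMU with the acceleration derived from each tracked body's position in the camera, and associate the device with the best-matching body. Gravity compensation, temporal alignment, filtering, and the camera-side tracking itself are glossed over; the correlation-based score and the toy trajectories are assumptions, not the paper's exact formulation.

```python
import numpy as np

def accel_magnitude_from_positions(positions, dt):
    # Second derivative of a tracked point's 3-D position -> acceleration magnitude.
    velocity = np.gradient(positions, dt, axis=0)
    accel = np.gradient(velocity, dt, axis=0)
    return np.linalg.norm(accel, axis=1)

def match_device_to_body(device_accel_mag, body_tracks, dt):
    """Return the index of the tracked body whose image-derived acceleration
    magnitude correlates best with the device's IMU acceleration magnitude."""
    scores = []
    for track in body_tracks:
        camera_mag = accel_magnitude_from_positions(track, dt)
        n = min(len(camera_mag), len(device_accel_mag))
        scores.append(np.corrcoef(device_accel_mag[:n], camera_mag[:n])[0, 1])
    return int(np.argmax(scores)), scores

# Toy scene: two tracked bodies, one of which moves like the device carrier.
dt = 1 / 30
t = np.arange(0, 5, dt)
walker = np.stack([np.sin(2 * np.pi * 0.8 * t), np.zeros_like(t), t], axis=1)
stander = np.stack([0.01 * np.sin(2 * np.pi * 0.1 * t),
                    np.zeros_like(t), np.zeros_like(t)], axis=1)
device = accel_magnitude_from_positions(walker, dt) \
         + np.random.default_rng(0).normal(0, 0.05, len(t))
print(match_device_to_body(device, [stander, walker], dt))   # expect index 1
```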
Analysis of Respiration for Prediction of "Who Will Be Next Speaker and When?" in Multi-Party Meetings
Pub Date: 2014-11-12 | DOI: 10.1145/2663204.2663271
Ryo Ishii, K. Otsuka, Shiro Kumano, Junji Yamato
Abstract: To build a model for predicting the next speaker and the start time of the next utterance in multi-party meetings, we performed a fundamental study of how respiration could be effective for such a prediction model. The results of the analysis reveal that a speaker inhales more rapidly and quickly right after the end of a unit of utterance in turn-keeping, and that the next speaker takes a bigger breath toward speaking in turn-changing than listeners who will not become the next speaker. Based on these results, we constructed prediction models to evaluate how effective the parameters are. The evaluation suggests that the speaker's inhalation right after a unit of utterance, such as its start time from the end of the unit of utterance and the slope and duration of the inhalation phase, is effective for predicting whether turn-keeping or turn-changing will happen, about 350 ms before the start time of the next utterance on average, and that a listener's inhalation before the next utterance, such as the maximal inspiration and amplitude of the inhalation phase, is effective for predicting the next speaker in turn-changing, about 900 ms before the start time of the next utterance on average.
Citations: 45
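A minimal sketch of the kind of inhalation-phase features the analysis above relies on: given a respiration waveform starting at the end of an utterance, locate the rise from the preceding trough to the inhalation peak and derive its onset, slope, duration, maximal inspiration, and amplitude. The peak-picking, the synthetic breathing traces, and the logistic-regression classifier are stand-ins for illustration, not the authors' models.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def inhalation_features(resp, fs):
    """Features of the first prominent inhalation after the utterance end
    (sample 0 of `resp`): the rise from the preceding minimum to the peak."""
    peak = int(np.argmax(resp))
    start = int(np.argmin(resp[:peak + 1]))
    duration = max(peak - start, 1) / fs
    amplitude = resp[peak] - resp[start]
    return np.array([start / fs,             # onset relative to utterance end
                     amplitude / duration,   # slope of the inhalation phase
                     duration,               # duration of the inhalation phase
                     resp[peak],             # maximal inspiration
                     amplitude])             # amplitude of the inhalation phase

# Synthetic breathing traces: one smooth inhalation bump after the utterance end.
fs = 50
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(0)

def synthetic_breath(onset, depth):
    return depth * np.exp(-((t - onset) ** 2) / 0.18) + rng.normal(0, 0.02, t.size)

X = np.stack([inhalation_features(synthetic_breath(rng.uniform(0.3, 1.5),
                                                   rng.uniform(0.5, 1.5)), fs)
              for _ in range(80)])
y = (X[:, 4] > 1.0).astype(int)   # pretend deeper breaths mark turn-changing
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("training accuracy:", clf.score(X, y))
```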
A Multimodal Context-based Approach for Distress Assessment
Pub Date: 2014-11-12 | DOI: 10.1145/2663204.2663274
Sayan Ghosh, Moitreya Chatterjee, Louis-Philippe Morency
Abstract: The increasing prevalence of psychological distress disorders, such as depression and post-traumatic stress, necessitates a serious effort to create new tools and technologies to help with their diagnosis and treatment. In recent years, new computational approaches have been proposed to objectively analyze patient non-verbal behaviors over the duration of the entire interaction between the patient and the clinician. In this paper, we go beyond non-verbal behaviors and propose a tri-modal approach that integrates verbal behaviors with acoustic and visual behaviors to analyze psychological distress during dyadic semi-structured interviews. Our approach exploits the dyadic nature of these interactions to contextualize the participant responses based on the affective components (intimacy and polarity levels) of the questions. We validate our approach using one of the largest corpora of semi-structured interviews for distress assessment, consisting of 154 multimodal dyadic interactions. Our results show a significant improvement in distress prediction performance when integrating verbal behaviors with acoustic and visual behaviors. In addition, our analysis shows that contextualizing the responses improves prediction performance, most significantly for positive and intimate questions.
Citations: 24
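A minimal sketch of context-conditioned multimodal fusion in the spirit of the abstract above: per-response verbal, acoustic, and visual feature vectors are concatenated together with a coarse encoding of the question's affective context (intimacy and polarity), and a single classifier predicts distress. The feature dimensions, the context encoding, the random stand-in data, and the classifier choice are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_responses = 300
verbal = rng.normal(size=(n_responses, 20))    # e.g. lexical / sentiment features
acoustic = rng.normal(size=(n_responses, 30))  # e.g. prosodic statistics
visual = rng.normal(size=(n_responses, 15))    # e.g. facial expression descriptors

# Question context: intimacy level and polarity, encoded as two scalar features.
context = np.stack([rng.integers(0, 3, n_responses),    # 0=low, 1=medium, 2=high intimacy
                    rng.integers(-1, 2, n_responses)],  # -1=negative, 0=neutral, 1=positive
                   axis=1)

X = np.hstack([verbal, acoustic, visual, context])
y = rng.integers(0, 2, n_responses)            # distressed vs. non-distressed response

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, y)
print("training accuracy:", model.score(X, y))
```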
Session details: Keynote Address 4
Pub Date: 2014-11-12 | DOI: 10.1145/3246752
J. Cohn
Citations: 0
Rhythmic Body Movements of Laughter
Pub Date: 2014-11-12 | DOI: 10.1145/2663204.2663240
Radoslaw Niewiadomski, M. Mancini, Yu Ding, C. Pelachaud, G. Volpe
Abstract: In this paper we focus on three aspects of multimodal expressions of laughter. First, we propose a procedural method to synthesize the rhythmic body movements of laughter based on spectral analysis of laughter episodes. For this purpose, we analyze laughter body motions from motion capture data and reconstruct them with appropriate harmonics. We then reduce the parameter space to two dimensions, which are the inputs of the actual model used to generate a continuum of rhythmic laughter body movements. Second, we propose a method to integrate the rhythmic body movements generated by our model with other synthesized expressive cues of laughter, such as facial expressions and additional body movements. Finally, we present a real-time human-virtual character interaction scenario in which the virtual character applies our model to respond to a human's laughter in real time.
Citations: 25
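A minimal sketch of the spectral idea described above: take a torso displacement signal from a laughter episode, keep its strongest harmonics, and resynthesize a rhythmic movement as their sum. The signal, sampling rate, and number of harmonics here are illustrative choices, and the toy trace stands in for motion-capture data.

```python
import numpy as np

def dominant_harmonics(signal, fs, k=3):
    """Return the k strongest harmonics as (frequency, amplitude, phase) triples."""
    n = len(signal)
    spectrum = np.fft.rfft(signal - signal.mean())
    freqs = np.fft.rfftfreq(n, 1 / fs)
    top = np.argsort(np.abs(spectrum))[::-1][:k]
    return [(freqs[i], 2 * np.abs(spectrum[i]) / n, np.angle(spectrum[i])) for i in top]

def resynthesize(harmonics, n_samples, fs):
    """Rebuild a rhythmic movement trajectory as a sum of the selected harmonics."""
    t = np.arange(n_samples) / fs
    return sum(a * np.cos(2 * np.pi * f * t + phase) for f, a, phase in harmonics)

# Toy laughter-like torso oscillation: fast pulses plus a slower body sway.
fs = 120
t = np.arange(0, 2, 1 / fs)
torso = 0.8 * np.sin(2 * np.pi * 5.0 * t) + 0.3 * np.sin(2 * np.pi * 1.5 * t)

harmonics = dominant_harmonics(torso, fs, k=2)
rebuilt = resynthesize(harmonics, len(t), fs)
print([round(f, 2) for f, _, _ in harmonics],   # dominant frequencies (Hz)
      float(np.max(np.abs(rebuilt - torso))))   # reconstruction error
```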
Session details: Oral Session 6: Healthcare and Assistive Technologies
Pub Date: 2014-11-12 | DOI: 10.1145/3246751
D. Bohus
Citations: 0
Personal Aesthetics for Soft Biometrics: A Generative Multi-resolution Approach
Pub Date: 2014-11-12 | DOI: 10.1145/2663204.2663259
Cristina Segalin, A. Perina, M. Cristani
Abstract: Are we recognizable by our image preferences? This paper answers the question affirmatively, presenting a soft biometric approach in which the preferred images of an individual are used as a personal signature in identification tasks. The approach builds a multi-resolution latent space, formed by multiple Counting Grids, in which similar images are mapped nearby. On this space, a set of preferred images of a user produces an ensemble of intensity maps, highlighting the user's personal aesthetic preferences in an intuitive way. These maps are then used to learn a battery of discriminative classifiers (one per resolution), which characterizes the user and serves to perform identification. The results are promising: on a dataset of 200 users and 40K images, using 20 preferred images as a biometric template gives a 66% probability of guessing the correct user. This makes "personal aesthetics" a very hot topic for soft biometrics, while its usage in standard biometric applications seems far from effective, as we show in a simple user study.
Citations: 19
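A heavily simplified sketch of the multi-resolution idea above. The paper's latent space is built from Counting Grids, which have no standard library implementation; here each preferred image is instead assumed to already have a 2-D embedding, the per-user "intensity maps" are 2-D histograms of those embeddings at several resolutions, and one linear classifier per resolution is combined by averaging decision scores. This is a stand-in pipeline under those assumptions, not the authors' model.

```python
import numpy as np
from sklearn.svm import LinearSVC

RESOLUTIONS = (4, 8, 16)

def intensity_maps(embeddings):
    # One flattened 2-D histogram of the user's preferred images per resolution.
    return [np.histogram2d(embeddings[:, 0], embeddings[:, 1],
                           bins=r, range=[[0, 1], [0, 1]])[0].ravel()
            for r in RESOLUTIONS]

# Stand-in data: each user's preferences cluster around a user-specific region
# of the (assumed, precomputed) 2-D image embedding space.
rng = np.random.default_rng(0)
n_users, imgs_per_user = 20, 40
centers = rng.uniform(0.2, 0.8, size=(n_users, 2))
users = [np.clip(c + rng.normal(0, 0.08, size=(imgs_per_user, 2)), 0, 1) for c in centers]

X_per_res = [[] for _ in RESOLUTIONS]
y = []
for uid, imgs in enumerate(users):
    for template in (imgs[:20], imgs[20:]):     # two 20-image templates per user
        for i, m in enumerate(intensity_maps(template)):
            X_per_res[i].append(m)
        y.append(uid)
y = np.array(y)

classifiers = [LinearSVC(max_iter=5000).fit(np.stack(X), y) for X in X_per_res]

def identify(embeddings):
    """Average per-resolution decision scores and return the predicted user id."""
    scores = np.mean([clf.decision_function(m.reshape(1, -1))
                      for clf, m in zip(classifiers, intensity_maps(embeddings))], axis=0)
    return int(np.argmax(scores))

probe = np.clip(centers[3] + rng.normal(0, 0.08, size=(20, 2)), 0, 1)
print(identify(probe))   # fresh images drawn from user 3's preference region
```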