{"title":"Retrieving Target Gestures Toward Speech Driven Animation with Meaningful Behaviors","authors":"Najmeh Sadoughi, C. Busso","doi":"10.1145/2818346.2820750","DOIUrl":"https://doi.org/10.1145/2818346.2820750","url":null,"abstract":"Creating believable behaviors for conversational agents (CAs) is a challenging task, given the complex relationship between speech and various nonverbal behaviors. The two main approaches are rule-based systems, which tend to produce behaviors with limited variations compared to natural interactions, and data-driven systems, which tend to ignore the underlying semantic meaning of the message (e.g., gestures without meaning). We envision a hybrid system, acting as the behavior realization layer in rule-based systems, while exploiting the rich variation in natural interactions. Constrained on a given target gesture (e.g., head nod) and speech signal, the system will generate novel realizations learned from the data, capturing the timely relationship between speech and gestures. An important task in this research is identifying multiple examples of the target gestures in the corpus. This paper proposes a data mining framework for detecting gestures of interest in a motion capture database. First, we train One-class support vector machines (SVMs) to detect candidate segments conveying the target gesture. Second, we use dynamic time alignment kernel (DTAK) to compare the similarity between the examples (i.e., target gesture) and the given segments. We evaluate the approach for five prototypical hand and head gestures showing reasonable performance. These retrieved gestures are then used to train a speech-driven framework based on dynamic Bayesian networks (DBNs) to synthesize these target behaviors.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"35 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73517836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining Multimodal Features within a Fusion Network for Emotion Recognition in the Wild","authors":"Bo Sun, Liandong Li, Guoyan Zhou, Xuewen Wu, Jun He, Lejun Yu, Dongxue Li, Qinglan Wei","doi":"10.1145/2818346.2830586","DOIUrl":"https://doi.org/10.1145/2818346.2830586","url":null,"abstract":"In this paper, we describe our work in the third Emotion Recognition in the Wild (EmotiW 2015) Challenge. For each video clip, we extract MSDF, LBP-TOP, HOG, LPQ-TOP and acoustic features to recognize the emotions of film characters. For the static facial expression recognition based on video frame, we extract MSDF, DCNN and RCNN features. We train linear SVM classifiers for these kinds of features on the AFEW and SFEW dataset, and we propose a novel fusion network to combine all the extracted features at decision level. The final achievement we gained is 51.02% on the AFEW testing set and 51.08% on the SFEW testing set, which are much better than the baseline recognition rate of 39.33% and 39.13%.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"287 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75425783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Active Learning Based on Agreement and Applied to Emotion Recognition in Spoken Interactions","authors":"Yue Zhang, E. Coutinho, Zixing Zhang, C. Quan, Björn Schuller","doi":"10.1145/2818346.2820774","DOIUrl":"https://doi.org/10.1145/2818346.2820774","url":null,"abstract":"In this contribution, we propose a novel method for Active Learning (AL) - Dynamic Active Learning (DAL) - which targets the reduction of the costly human labelling work necessary for modelling subjective tasks such as emotion recognition in spoken interactions. The method implements an adaptive query strategy that minimises the amount of human labelling work by deciding for each instance whether it should automatically be labelled by machine or manually by human, as well as how many human annotators are required. Extensive experiments on standardised test-beds show that DAL significantly improves the efficiency of conventional AL. In particular, DAL achieves the same classification accuracy obtained with AL with up to 79.17% less human annotation effort.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77852769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sharing Touch Interfaces: Proximity-Sensitive Touch Targets for Tablet-Mediated Collaboration","authors":"Ilhan Aslan, Thomas Meneweger, Verena Fuchsberger, M. Tscheligi","doi":"10.1145/2818346.2820740","DOIUrl":"https://doi.org/10.1145/2818346.2820740","url":null,"abstract":"During conversational practices, such as a tablet-mediated sales conversation between a salesperson and a customer, tablets are often used by two users who prefer specific bodily formations in order to easily face each other and the surface of the touchscreen. In a series of studies, we investigated bodily formations that are preferred during tablet-mediated sales conversations, and explored the effect of these formations on performance in acquiring touch targets (e.g., buttons) on a tablet device. We found that bodily formations cause decreased viewing angles to the shared screen, which results in a decreased performance in target acquisition. In order to address this issue, a multi-modal design consideration is presented, which combines mid-air finger movement and touch into a unified input modality, allowing the design of proximity sensitive touch targets. We conclude that the proposed embodied interaction design not only has potential to improve targeting performance, but also adapts the ``agency' of touch targets for multi-user settings.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"57 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85040437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CuddleBits: Friendly, Low-cost Furballs that Respond to Touch","authors":"Laura Cang, Paul Bucci, Karon E Maclean","doi":"10.1145/2818346.2823293","DOIUrl":"https://doi.org/10.1145/2818346.2823293","url":null,"abstract":"We present a real-time touch gesture recognition system using a low-cost fabric pressure sensor mounted on a small zoomorphic robot, affectionately called the `CuddleBit'. We explore the relationship between gesture recognition and affect through the lens of human-robot interaction. We demonstrate our real-time gesture recognition system, including both software and hardware, and a haptic display that brings the CuddleBit to life.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"429 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83589054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Oral Session 3: Language, Speech and Dialog","authors":"J. Lehman","doi":"10.1145/3252448","DOIUrl":"https://doi.org/10.1145/3252448","url":null,"abstract":"","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85435502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing Multimodality of Video for User Engagement Assessment","authors":"F. Salim, F. Haider, Owen Conlan, S. Luz, N. Campbell","doi":"10.1145/2818346.2820775","DOIUrl":"https://doi.org/10.1145/2818346.2820775","url":null,"abstract":"These days, several hours of new video content is uploaded to the internet every second. It is simply impossible for anyone to see every piece of video which could be engaging or even useful to them. Therefore it is desirable to identify videos that might be regarded as engaging automatically, for a variety of applications such as recommendation and personalized video segmentation etc. This paper explores how multimodal characteristics of video, such as prosodic, visual and paralinguistic features, can help in assessing user engagement with videos. The approach proposed in this paper achieved good accuracy (maximum F score of 96.93 %) through a novel combination of features extracted directly from video recordings, demonstrating the potential of this method in identifying engaging content.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78485996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Application of Word Processor UI paradigms to Audio and Animation Editing","authors":"A. D. Milota","doi":"10.1145/2818346.2823292","DOIUrl":"https://doi.org/10.1145/2818346.2823292","url":null,"abstract":"This demonstration showcases Quixotic, an audio editor, and Quintessence, an animation editor. Both appropriate many of the interaction techniques found in word processors, and allow users to more quickly create time-variant media. Our different approach to the interface aims to make recorded speech and simple animation into media that can be efficiently used for one-to-one asynchronous communications, quick note taking and documentation, as well as for idea refinement.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"160 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80101536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Public Speaking Performance Assessment","authors":"T. Wörtwein, Mathieu Chollet, Boris Schauerte, Louis-Philippe Morency, R. Stiefelhagen, Stefan Scherer","doi":"10.1145/2818346.2820762","DOIUrl":"https://doi.org/10.1145/2818346.2820762","url":null,"abstract":"The ability to speak proficiently in public is essential for many professions and in everyday life. Public speaking skills are difficult to master and require extensive training. Recent developments in technology enable new approaches for public speaking training that allow users to practice in engaging and interactive environments. Here, we focus on the automatic assessment of nonverbal behavior and multimodal modeling of public speaking behavior. We automatically identify audiovisual nonverbal behaviors that are correlated to expert judges' opinions of key performance aspects. These automatic assessments enable a virtual audience to provide feedback that is essential for training during a public speaking performance. We utilize multimodal ensemble tree learners to automatically approximate expert judges' evaluations to provide post-hoc performance assessments to the speakers. Our automatic performance evaluation is highly correlated with the experts' opinions with r = 0.745 for the overall performance assessments. We compare multimodal approaches with single modalities and find that the multimodal ensembles consistently outperform single modalities.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"43 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80239130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AttentiveLearner: Adaptive Mobile MOOC Learning via Implicit Cognitive States Inference","authors":"Xiang Xiao, Phuong Pham, Jingtao Wang","doi":"10.1145/2818346.2823297","DOIUrl":"https://doi.org/10.1145/2818346.2823297","url":null,"abstract":"This demo presents AttentiveLearner, a mobile learning system optimized for consuming lecture videos in Massive Open Online Courses (MOOCs) and flipped classrooms. AttentiveLearner uses on-lens finger gestures for video control and captures learners' physiological states through implicit heart rate tracking on unmodified mobile phones. Through three user studies to date, we found AttentiveLearner easy to learn, and intuitive to use. The heart beat waveforms captured by AttentiveLearner can be used to infer learners' cognitive states and attention. AttentiveLearner may serve as a promising supplemental feedback channel orthogonal to today's learning analytics technologies.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"64 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80329291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}