Proceedings of the 2015 ACM on International Conference on Multimodal Interaction: Latest Publications

Retrieving Target Gestures Toward Speech Driven Animation with Meaningful Behaviors
Najmeh Sadoughi, C. Busso
DOI: https://doi.org/10.1145/2818346.2820750 | Published: 2015-11-09
Abstract: Creating believable behaviors for conversational agents (CAs) is a challenging task, given the complex relationship between speech and various nonverbal behaviors. The two main approaches are rule-based systems, which tend to produce behaviors with limited variations compared to natural interactions, and data-driven systems, which tend to ignore the underlying semantic meaning of the message (e.g., gestures without meaning). We envision a hybrid system, acting as the behavior realization layer in rule-based systems, while exploiting the rich variation in natural interactions. Constrained on a given target gesture (e.g., head nod) and speech signal, the system will generate novel realizations learned from the data, capturing the temporal relationship between speech and gestures. An important task in this research is identifying multiple examples of the target gestures in the corpus. This paper proposes a data mining framework for detecting gestures of interest in a motion capture database. First, we train one-class support vector machines (SVMs) to detect candidate segments conveying the target gesture. Second, we use the dynamic time alignment kernel (DTAK) to compare the similarity between the examples (i.e., target gesture) and the given segments. We evaluate the approach for five prototypical hand and head gestures, showing reasonable performance. These retrieved gestures are then used to train a speech-driven framework based on dynamic Bayesian networks (DBNs) to synthesize these target behaviors.
Citations: 16
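The retrieval framework summarized above combines a one-class SVM detector with a sequence-similarity measure. The following is a minimal sketch of that two-stage pattern, assuming scikit-learn, fixed-length motion windows with crude mean-pose descriptors, and a plain Euclidean DTW standing in for the paper's DTAK kernel; it is an illustration, not the authors' implementation.

```python
# Minimal sketch of the two-stage retrieval idea: a one-class SVM proposes
# candidate segments, and a DTW-style distance ranks them against a target
# gesture. The windowing and window descriptors are invented for illustration
# and stand in for the paper's motion-capture features.
import numpy as np
from sklearn.svm import OneClassSVM

def segment_windows(motion, win=60, hop=30):
    """Slice a (frames x dims) motion stream into overlapping windows."""
    return [motion[s:s + win] for s in range(0, len(motion) - win + 1, hop)]

def dtw_distance(a, b):
    """Plain Euclidean DTW; the paper uses a DTAK kernel instead."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def retrieve_candidates(motion, target_examples, nu=0.1):
    """Flag windows that resemble the target gesture, then rank them by DTW."""
    windows = segment_windows(motion)
    feats = np.array([w.mean(axis=0) for w in windows])          # crude window descriptor
    train = np.array([t.mean(axis=0) for t in target_examples])  # seed examples of the gesture
    detector = OneClassSVM(kernel="rbf", nu=nu, gamma="scale").fit(train)
    keep = [w for w, flag in zip(windows, detector.predict(feats)) if flag == 1]
    ref = target_examples[0]
    return sorted(keep, key=lambda w: dtw_distance(w, ref))
```

Swapping dtw_distance for a DTAK-style alignment kernel would bring this closer to the ranking step the paper actually describes.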
Combining Multimodal Features within a Fusion Network for Emotion Recognition in the Wild
Bo Sun, Liandong Li, Guoyan Zhou, Xuewen Wu, Jun He, Lejun Yu, Dongxue Li, Qinglan Wei
DOI: https://doi.org/10.1145/2818346.2830586 | Published: 2015-11-09
Abstract: In this paper, we describe our work in the third Emotion Recognition in the Wild (EmotiW 2015) Challenge. For each video clip, we extract MSDF, LBP-TOP, HOG, LPQ-TOP and acoustic features to recognize the emotions of film characters. For static facial expression recognition based on video frames, we extract MSDF, DCNN and RCNN features. We train linear SVM classifiers for these kinds of features on the AFEW and SFEW datasets, and we propose a novel fusion network to combine all the extracted features at the decision level. We achieve 51.02% on the AFEW test set and 51.08% on the SFEW test set, well above the baseline recognition rates of 39.33% and 39.13%.
Citations: 48
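The abstract describes per-feature linear SVMs merged at the decision level. The sketch below shows a generic stacking-style decision fusion under those assumptions: one LinearSVC per feature block, with a logistic-regression fuser over their decision scores. The feature blocks are placeholders and the logistic-regression fuser is an assumption; the paper's fusion network itself is not reproduced here.

```python
# Sketch of decision-level fusion: one linear SVM per feature type, with a
# second-stage classifier stacked on their decision scores. This is a generic
# stacking scheme, not the paper's fusion network.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression

def _scores(clf, X):
    """Per-class decision scores as a 2-D array, even for binary problems."""
    s = clf.decision_function(X)
    return s if s.ndim == 2 else s[:, None]

def fit_fusion(feature_blocks, labels):
    """feature_blocks: dict of modality name -> (n_samples, dim) feature matrix."""
    base = {name: LinearSVC(C=1.0).fit(X, labels) for name, X in feature_blocks.items()}
    stacked = np.hstack([_scores(base[name], X) for name, X in feature_blocks.items()])
    fuser = LogisticRegression(max_iter=1000).fit(stacked, labels)
    return base, fuser

def predict_fusion(base, fuser, feature_blocks):
    """feature_blocks must use the same modality names as in training."""
    stacked = np.hstack([_scores(base[name], X) for name, X in feature_blocks.items()])
    return fuser.predict(stacked)
```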
Dynamic Active Learning Based on Agreement and Applied to Emotion Recognition in Spoken Interactions
Yue Zhang, E. Coutinho, Zixing Zhang, C. Quan, Björn Schuller
DOI: https://doi.org/10.1145/2818346.2820774 | Published: 2015-11-09
Abstract: In this contribution, we propose a novel method for Active Learning (AL), Dynamic Active Learning (DAL), which targets the reduction of the costly human labelling work necessary for modelling subjective tasks such as emotion recognition in spoken interactions. The method implements an adaptive query strategy that minimises the amount of human labelling work by deciding, for each instance, whether it should be labelled automatically by machine or manually by a human, as well as how many human annotators are required. Extensive experiments on standardised test beds show that DAL significantly improves the efficiency of conventional AL. In particular, DAL achieves the same classification accuracy as AL with up to 79.17% less human annotation effort.
Citations: 21
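DAL's central decision, as summarized above, is whether an instance can be labelled by machine or needs one or several human annotators. A minimal sketch of one plausible agreement-based routing rule follows, assuming a committee of probabilistic classifiers and hand-picked confidence thresholds; the paper's exact query strategy may differ.

```python
# Illustrative agreement-based routing for active learning: if a committee of
# models agrees confidently on an instance, accept the machine label; otherwise
# request human annotations, asking for more annotators the lower the agreement.
# The thresholds and committee are assumptions, not the paper's exact strategy.
import numpy as np

def route_instance(committee, x, high_agree=0.9, low_agree=0.6, max_annotators=3):
    """committee: list of fitted classifiers exposing predict_proba.
    Returns (machine_label_or_None, number_of_human_annotators_requested)."""
    probs = np.mean([clf.predict_proba(x.reshape(1, -1))[0] for clf in committee], axis=0)
    label = int(np.argmax(probs))
    agreement = float(probs[label])
    if agreement >= high_agree:
        return label, 0              # confident agreement: keep the machine label
    if agreement >= low_agree:
        return None, 1               # moderate agreement: one human annotator
    return None, max_annotators      # low agreement: ask several humans
```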
Sharing Touch Interfaces: Proximity-Sensitive Touch Targets for Tablet-Mediated Collaboration
Ilhan Aslan, Thomas Meneweger, Verena Fuchsberger, M. Tscheligi
DOI: https://doi.org/10.1145/2818346.2820740 | Published: 2015-11-09
Abstract: During conversational practices, such as a tablet-mediated sales conversation between a salesperson and a customer, tablets are often used by two users who prefer specific bodily formations in order to easily face each other and the surface of the touchscreen. In a series of studies, we investigated bodily formations that are preferred during tablet-mediated sales conversations and explored the effect of these formations on performance in acquiring touch targets (e.g., buttons) on a tablet device. We found that these bodily formations cause decreased viewing angles to the shared screen, which results in decreased target acquisition performance. To address this issue, we present a multimodal design consideration that combines mid-air finger movement and touch into a unified input modality, allowing the design of proximity-sensitive touch targets. We conclude that the proposed embodied interaction design not only has the potential to improve targeting performance, but also adapts the "agency" of touch targets for multi-user settings.
Citations: 9
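The proposed design combines mid-air finger tracking with touch to make targets proximity sensitive. A hypothetical sketch of such a target follows: the effective hit radius grows as the reported hover distance shrinks. The 50 mm sensing range, the linear scaling, and the TouchTarget class are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical proximity-sensitive touch target: the effective hit radius grows
# as the tracked finger approaches the screen, compensating for the steeper
# viewing angles reported in the study. Constants are illustrative only.
from dataclasses import dataclass

@dataclass
class TouchTarget:
    x: float            # centre x, in px
    y: float            # centre y, in px
    base_radius: float  # visual radius, in px

    def effective_radius(self, hover_mm: float, max_boost: float = 1.6) -> float:
        """Linearly enlarge the hit area as the finger descends from 50 mm to contact."""
        closeness = max(0.0, min(1.0, 1.0 - hover_mm / 50.0))
        return self.base_radius * (1.0 + (max_boost - 1.0) * closeness)

    def hit(self, touch_x: float, touch_y: float, hover_mm: float) -> bool:
        r = self.effective_radius(hover_mm)
        return (touch_x - self.x) ** 2 + (touch_y - self.y) ** 2 <= r * r

# Example: a 30 px button accepts a touch 40 px off-centre when the finger
# was tracked 5 mm above the surface just before contact.
button = TouchTarget(x=200.0, y=120.0, base_radius=30.0)
print(button.hit(240.0, 120.0, hover_mm=5.0))
```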
CuddleBits: Friendly, Low-cost Furballs that Respond to Touch
Laura Cang, Paul Bucci, Karon E Maclean
DOI: https://doi.org/10.1145/2818346.2823293 | Published: 2015-11-09
Abstract: We present a real-time touch gesture recognition system using a low-cost fabric pressure sensor mounted on a small zoomorphic robot, affectionately called the "CuddleBit". We explore the relationship between gesture recognition and affect through the lens of human-robot interaction. We demonstrate our real-time gesture recognition system, including both software and hardware, and a haptic display that brings the CuddleBit to life.
Citations: 10
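A system like this needs a lightweight real-time recognizer over the pressure-sensor stream. The sketch below illustrates one common pattern, windowed statistics fed to a random forest, assuming scikit-learn; the features, window handling, and classifier are assumptions rather than the CuddleBit's actual recognizer.

```python
# Rough sketch of real-time touch-gesture recognition from a fabric pressure
# sensor: compute simple statistics over a sliding window of pressure samples
# and classify the window. Everything here is an illustrative assumption.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def window_features(pressure_window):
    """pressure_window: 1-D array of pressure samples from one sensor channel."""
    diffs = np.diff(pressure_window)
    return np.array([
        pressure_window.mean(),   # overall contact strength
        pressure_window.std(),    # variability (stroke vs. poke)
        pressure_window.max(),    # peak force
        np.abs(diffs).mean(),     # how quickly pressure changes
        (diffs > 0).mean(),       # fraction of rising samples
    ])

def train_recognizer(windows, labels):
    """windows: list of 1-D arrays; labels: gesture names, e.g. 'stroke', 'poke'."""
    X = np.vstack([window_features(w) for w in windows])
    return RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)

def classify(recognizer, live_window):
    return recognizer.predict(window_features(live_window).reshape(1, -1))[0]
```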
Session details: Oral Session 3: Language, Speech and Dialog
J. Lehman
DOI: https://doi.org/10.1145/3252448 | Published: 2015-11-09
Citations: 0
Analyzing Multimodality of Video for User Engagement Assessment
F. Salim, F. Haider, Owen Conlan, S. Luz, N. Campbell
DOI: https://doi.org/10.1145/2818346.2820775 | Published: 2015-11-09
Abstract: These days, several hours of new video content are uploaded to the internet every second. It is simply impossible for anyone to see every piece of video that could be engaging or even useful to them. It is therefore desirable to automatically identify videos that might be regarded as engaging, for a variety of applications such as recommendation and personalized video segmentation. This paper explores how multimodal characteristics of video, such as prosodic, visual and paralinguistic features, can help in assessing user engagement with videos. The approach proposed in this paper achieved good accuracy (a maximum F score of 96.93%) through a novel combination of features extracted directly from video recordings, demonstrating the potential of this method for identifying engaging content.
Citations: 3
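The abstract reports an F score obtained from a combination of prosodic, visual and paralinguistic features. As a hedged illustration of that evaluation pattern, the sketch below concatenates per-video feature matrices (early fusion), runs a cross-validated SVM, and computes the F score; the modality contents and the classifier are placeholders, not the paper's own feature combination.

```python
# Sketch of early fusion for engagement prediction: per-video feature vectors
# from several modalities are concatenated and fed to a single classifier,
# evaluated with the F score as in the abstract. Placeholders throughout.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import f1_score

def fuse_modalities(prosodic, visual, paralinguistic):
    """Each argument: (n_videos, dim) array. Returns the concatenated matrix."""
    return np.hstack([prosodic, visual, paralinguistic])

def evaluate_engagement(prosodic, visual, paralinguistic, engaged_labels):
    """engaged_labels: 1 for engaging videos, 0 otherwise. Returns the F score."""
    X = fuse_modalities(prosodic, visual, paralinguistic)
    preds = cross_val_predict(SVC(kernel="rbf", C=1.0), X, engaged_labels, cv=5)
    return f1_score(engaged_labels, preds)
```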
The Application of Word Processor UI paradigms to Audio and Animation Editing
A. D. Milota
DOI: https://doi.org/10.1145/2818346.2823292 | Published: 2015-11-09
Abstract: This demonstration showcases Quixotic, an audio editor, and Quintessence, an animation editor. Both appropriate many of the interaction techniques found in word processors and allow users to create time-variant media more quickly. Our different approach to the interface aims to make recorded speech and simple animation into media that can be used efficiently for one-to-one asynchronous communication, quick note taking and documentation, as well as for idea refinement.
Citations: 0
Multimodal Public Speaking Performance Assessment
T. Wörtwein, Mathieu Chollet, Boris Schauerte, Louis-Philippe Morency, R. Stiefelhagen, Stefan Scherer
DOI: https://doi.org/10.1145/2818346.2820762 | Published: 2015-11-09
Abstract: The ability to speak proficiently in public is essential for many professions and in everyday life. Public speaking skills are difficult to master and require extensive training. Recent developments in technology enable new approaches to public speaking training that allow users to practice in engaging and interactive environments. Here, we focus on the automatic assessment of nonverbal behavior and multimodal modeling of public speaking behavior. We automatically identify audiovisual nonverbal behaviors that are correlated with expert judges' opinions of key performance aspects. These automatic assessments enable a virtual audience to provide feedback that is essential for training during a public speaking performance. We utilize multimodal ensemble tree learners to automatically approximate expert judges' evaluations and provide post-hoc performance assessments to the speakers. Our automatic performance evaluation is highly correlated with the experts' opinions, with r = 0.745 for the overall performance assessments. We compare multimodal approaches with single modalities and find that the multimodal ensembles consistently outperform single modalities.
Citations: 67
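The assessment step described above regresses expert judges' ratings from multimodal features with ensemble tree learners and reports a Pearson correlation. A minimal sketch of that pattern, assuming scikit-learn and SciPy with placeholder features and hyperparameters, follows; it is not the authors' configuration.

```python
# Sketch of approximating expert public-speaking ratings with an ensemble of
# trees over fused audiovisual features, scored by Pearson correlation as in
# the abstract (the paper reports r = 0.745). Placeholder features and settings.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

def assess_performance(audio_feats, visual_feats, expert_scores):
    """audio_feats, visual_feats: (n_talks, dim) arrays; expert_scores: mean judge ratings."""
    X = np.hstack([audio_feats, visual_feats])            # simple multimodal fusion
    model = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0)
    preds = cross_val_predict(model, X, expert_scores, cv=5)
    r, _ = pearsonr(expert_scores, preds)                 # agreement with the judges
    return r, preds
```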
AttentiveLearner: Adaptive Mobile MOOC Learning via Implicit Cognitive States Inference
Xiang Xiao, Phuong Pham, Jingtao Wang
DOI: https://doi.org/10.1145/2818346.2823297 | Published: 2015-11-09
Abstract: This demo presents AttentiveLearner, a mobile learning system optimized for consuming lecture videos in Massive Open Online Courses (MOOCs) and flipped classrooms. AttentiveLearner uses on-lens finger gestures for video control and captures learners' physiological states through implicit heart rate tracking on unmodified mobile phones. In three user studies to date, we found AttentiveLearner easy to learn and intuitive to use. The heartbeat waveforms captured by AttentiveLearner can be used to infer learners' cognitive states and attention. AttentiveLearner may serve as a promising supplemental feedback channel orthogonal to today's learning analytics technologies.
Citations: 7
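Implicit heart rate tracking from a fingertip on the camera lens is in essence a photoplethysmography problem: frame brightness pulses with blood volume. The sketch below estimates beats per minute from such a brightness signal by band-pass filtering and peak counting; the frame rate, filter band, and peak logic are generic assumptions, not AttentiveLearner's implementation.

```python
# Generic sketch of camera-based heart-rate estimation: treat the mean frame
# brightness under the fingertip as a PPG signal, band-pass it around typical
# heart-rate frequencies, and count peaks. All constants are assumptions.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def estimate_bpm(brightness, fs=30.0):
    """brightness: 1-D array of mean frame intensities; fs: camera frame rate (Hz)."""
    low, high = 0.7, 3.5                                   # roughly 42-210 bpm
    b, a = butter(3, [low / (fs / 2), high / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, brightness - brightness.mean())
    peaks, _ = find_peaks(filtered, distance=fs / high)    # at most one peak per beat
    duration_s = len(brightness) / fs
    return 60.0 * len(peaks) / duration_s
```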