Proceedings of the 2015 ACM on International Conference on Multimodal Interaction: Latest Publications

Multimodal Assessment of Teaching Behavior in Immersive Rehearsal Environment-TeachLivE
R. Barmaki
{"title":"Multimodal Assessment of Teaching Behavior in Immersive Rehearsal Environment-TeachLivE","authors":"R. Barmaki","doi":"10.1145/2818346.2823306","DOIUrl":"https://doi.org/10.1145/2818346.2823306","url":null,"abstract":"Nonverbal behaviors such as facial expressions, eye contact, gestures, and body movements in general have strong impacts on the process of communicative interactions. Gestures play an important role in interpersonal communication in the classroom between student and teacher. To assist teachers with exhibiting open and positive nonverbal signals in their actual classroom, we have designed a multimodal teaching application with provisions for real-time feedback in coordination with our TeachLivE test-bed environment and its reflective application; ReflectLivE. Individuals walk into this virtual environment and interact with five virtual students shown on a large screen display. The recent research study is designed to have two settings (7-minute long each). In each of the settings, the participants are provided lesson plans from which they teach. All the participants are asked to take part in both settings, with half receiving automated real-time feedback about their body poses in the first session (group 1) and the other half receiving such feedback in the second session (group 2). Feedback is in the form of a visual indication each time the participant exhibits a closed stance. To create this automated feedback application, a closed posture corpus was collected and trained based on the existing TeachLivE teaching records. After each session, the participants take a post-questionnaire about their experience. We hypothesize that visual feedback improves positive body gestures for both groups during the feedback session, and that, for group 2, this persists into their second unaided session but, for group 1, improvements occur only during the second session.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72720552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
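The feedback described above is driven by a model trained on a collected closed-posture corpus; that model is not reproduced here. As a purely illustrative sketch of the kind of body-pose rule such a system might start from (joint names, coordinates, and the crossed-arms criterion below are hypothetical, not the paper's feature set or classifier), a simple check over 2D skeleton joints could look like:

```python
import numpy as np

def is_closed_stance(joints: dict) -> bool:
    """Rough heuristic: both wrists past the body midline, i.e. arms crossed.

    `joints` maps joint names to (x, y) image coordinates from a Kinect-style
    skeleton tracker (key names are hypothetical). Handedness depends on
    whether the camera view is mirrored.
    """
    lw, rw = np.array(joints["left_wrist"]), np.array(joints["right_wrist"])
    ls, rs = np.array(joints["left_shoulder"]), np.array(joints["right_shoulder"])
    midline_x = (ls[0] + rs[0]) / 2.0
    # Each wrist lying on the opposite side of the torso midline suggests crossed arms.
    return lw[0] > midline_x and rw[0] < midline_x

# Example frame (coordinates invented for illustration).
frame = {"left_wrist": (320, 400), "right_wrist": (250, 410),
         "left_shoulder": (240, 250), "right_shoulder": (340, 250)}
print(is_closed_stance(frame))  # True: both wrists are past the midline
```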
Micro-opinion Sentiment Intensity Analysis and Summarization in Online Videos
Amir Zadeh
{"title":"Micro-opinion Sentiment Intensity Analysis and Summarization in Online Videos","authors":"Amir Zadeh","doi":"10.1145/2818346.2823317","DOIUrl":"https://doi.org/10.1145/2818346.2823317","url":null,"abstract":"There has been substantial progress in the field of text based sentiment analysis but little effort has been made to incorporate other modalities. Previous work in sentiment analysis has shown that using multimodal data yields to more accurate models of sentiment. Efforts have been made towards expressing sentiment as a spectrum of intensity rather than just positive or negative. Such models are useful not only for detection of positivity or negativity, but also giving out a score of how positive or negative a statement is. Based on the state of the art studies in sentiment analysis, prediction in terms of sentiment score is still far from accurate, even in large datasets [27]. Another challenge in sentiment analysis is dealing with small segments or micro opinions as they carry less context than large segments thus making analysis of the sentiment harder. This paper presents a Ph.D. thesis shaped towards comprehensive studies in multimodal micro-opinion sentiment intensity analysis.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"43 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74035397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 26
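The abstract above treats sentiment as a continuous intensity score rather than a binary label. As a minimal, text-only illustration of intensity regression (a toy scikit-learn baseline on invented sentences and scores, not the thesis's multimodal model):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Tiny toy corpus with intensity labels in [-3, 3] (invented for illustration).
segments = ["this movie was absolutely wonderful",
            "it was okay, nothing special",
            "the plot was painfully dull and the acting terrible"]
intensity = [2.8, 0.2, -2.5]

# Bag-of-ngrams features feeding a ridge regressor that predicts a real-valued score.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
model.fit(segments, intensity)
print(model.predict(["a truly wonderful and special movie"]))
```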
Exploiting Multimodal Affect and Semantics to Identify Politically Persuasive Web Videos
Behjat Siddiquie, Dave Chisholm, Ajay Divakaran
{"title":"Exploiting Multimodal Affect and Semantics to Identify Politically Persuasive Web Videos","authors":"Behjat Siddiquie, Dave Chisholm, Ajay Divakaran","doi":"10.1145/2818346.2820732","DOIUrl":"https://doi.org/10.1145/2818346.2820732","url":null,"abstract":"We introduce the task of automatically classifying politically persuasive web videos and propose a highly effective multi-modal approach for this task. We extract audio, visual, and textual features that attempt to capture affect and semantics in the audio-visual content and sentiment in the viewers' comments. We demonstrate that each of the feature modalities can be used to classify politically persuasive content, and that fusing them leads to the best performance. We also perform experiments to examine human accuracy and inter-coder reliability for this task and show that our best automatic classifier slightly outperforms average human performance. Finally we show that politically persuasive videos generate more strongly negative viewer comments than non-persuasive videos and analyze how affective content can be used to predict viewer reactions.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"505 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77345738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 37
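A common way to combine per-modality evidence, in the spirit of the fusion the paper above reports, is late fusion of per-modality classifier outputs. The sketch below is a generic example on synthetic features; it is not the authors' feature extraction or fusion scheme:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
# Stand-in feature matrices for three modalities (synthetic data, arbitrary dimensions).
X_audio, X_visual, X_text = (rng.normal(size=(n, d)) for d in (16, 32, 24))
y = rng.integers(0, 2, size=n)  # 1 = persuasive, 0 = not persuasive (random toy labels)

probs = []
for X in (X_audio, X_visual, X_text):
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    probs.append(clf.predict_proba(X)[:, 1])

# Late fusion: average the per-modality posterior probabilities, then threshold.
fused = np.mean(probs, axis=0)
print("fused accuracy on training data:", ((fused > 0.5) == y).mean())
```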
A Distributed Architecture for Interacting with NAO
Fabien Badeig, Quentin Pelorson, S. Arias, Vincent Drouard, I. D. Gebru, Xiaofei Li, Georgios D. Evangelidis, R. Horaud
{"title":"A Distributed Architecture for Interacting with NAO","authors":"Fabien Badeig, Quentin Pelorson, S. Arias, Vincent Drouard, I. D. Gebru, Xiaofei Li, Georgios D. Evangelidis, R. Horaud","doi":"10.1145/2818346.2823303","DOIUrl":"https://doi.org/10.1145/2818346.2823303","url":null,"abstract":"One of the main applications of the humanoid robot NAO - a small robot companion - is human-robot interaction (HRI). NAO is particularly well suited for HRI applications because of its design, hardware specifications, programming capabilities, and affordable cost. Indeed, NAO can stand up, walk, wander, dance, play soccer, sit down, recognize and grasp simple objects, detect and identify people, localize sounds, understand some spoken words, engage itself in simple and goal-directed dialogs, and synthesize speech. This is made possible due to the robot's 24 degree-of-freedom articulated structure (body, legs, feet, arms, hands, head, etc.), motors, cameras, microphones, etc., as well as to its on-board computing hardware and embedded software, e.g., robot motion control. Nevertheless, the current NAO configuration has two drawbacks that restrict the complexity of interactive behaviors that could potentially be implemented. Firstly, the on-board computing resources are inherently limited, which implies that it is difficult to implement sophisticated computer vision and audio signal analysis algorithms required by advanced interactive tasks. Secondly, programming new robot functionalities currently implies the development of embedded software, which is a difficult task in its own right necessitating specialized knowledge. The vast majority of HRI practitioners may not have this kind of expertise and hence they cannot easily and quickly implement their ideas, carry out thorough experimental validations, and design proof-of-concept demonstrators. We have developed a distributed software architecture that attempts to overcome these two limitations. Broadly speaking, NAO's on-board computing resources are augmented with external computing resources. The latter is a computer platform with its CPUs, GPUs, memory, operating system, libraries, software packages, internet access, etc. This configuration enables easy and fast development in Matlab, C, C++, or Python. Moreover, it allows the user to combine on-board libraries (motion control, face detection, etc.) with external toolboxes, e.g., OpenCv.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"74 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76388108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
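The architecture above offloads heavy vision and audio processing from the robot to an external workstation. Purely as an illustration of that idea, assuming a plain TCP stream of length-prefixed JPEG frames (this is not the actual system and does not use the NAOqi API), an external server could receive frames and run OpenCV face detection like this:

```python
import socket
import struct

import cv2
import numpy as np

def serve(host: str = "0.0.0.0", port: int = 5005) -> None:
    """Receive length-prefixed JPEG frames over TCP and run face detection on each."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    with socket.create_server((host, port)) as srv:
        conn, _ = srv.accept()
        with conn:
            while True:
                header = conn.recv(4)          # 4-byte big-endian frame size
                if len(header) < 4:
                    break
                (size,) = struct.unpack("!I", header)
                buf = b""
                while len(buf) < size:          # read the full JPEG payload
                    chunk = conn.recv(size - len(buf))
                    if not chunk:
                        return
                    buf += chunk
                frame = cv2.imdecode(np.frombuffer(buf, np.uint8), cv2.IMREAD_COLOR)
                if frame is None:
                    continue
                faces = detector.detectMultiScale(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
                print(f"frame received, {len(faces)} face(s) detected")

if __name__ == "__main__":
    serve()
```

The sender side (on or near the robot) only needs to JPEG-encode each camera frame and write the 4-byte length followed by the bytes.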
Challenges in Deep Learning for Multimodal Applications
Sayan Ghosh
{"title":"Challenges in Deep Learning for Multimodal Applications","authors":"Sayan Ghosh","doi":"10.1145/2818346.2823313","DOIUrl":"https://doi.org/10.1145/2818346.2823313","url":null,"abstract":"This consortium paper outlines a research plan for investigating deep learning techniques as applied to multimodal multi-task learning and multimodal fusion. We discuss our prior research results in this area, and how these results motivate us to explore more in this direction. We also define concrete steps of enquiry we wish to undertake as a short-term goal, and further outline some other challenges of multimodal learning using deep neural networks, such as inter and intra-modality synchronization, robustness to noise in modality data acquisition, and data insufficiency.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"36 6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77677328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
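For the multimodal fusion problem mentioned above, a common starting point is a network with one encoder per modality whose outputs are concatenated before a shared classifier head. A minimal PyTorch sketch of that generic pattern (dimensions are arbitrary; this is not the author's proposed model):

```python
import torch
import torch.nn as nn

class LateConcatFusion(nn.Module):
    """Encode each modality separately, then classify the concatenated encodings."""

    def __init__(self, audio_dim=40, visual_dim=128, hidden=64, n_classes=2):
        super().__init__()
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, audio, visual):
        h = torch.cat([self.audio_enc(audio), self.visual_enc(visual)], dim=-1)
        return self.head(h)

model = LateConcatFusion()
logits = model(torch.randn(8, 40), torch.randn(8, 128))  # batch of 8 synthetic samples
print(logits.shape)  # torch.Size([8, 2])
```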
Deciphering the Silent Participant: On the Use of Audio-Visual Cues for the Classification of Listener Categories in Group Discussions
Catharine Oertel, Kenneth Alberto Funes Mora, Joakim Gustafson, J. Odobez
{"title":"Deciphering the Silent Participant: On the Use of Audio-Visual Cues for the Classification of Listener Categories in Group Discussions","authors":"Catharine Oertel, Kenneth Alberto Funes Mora, Joakim Gustafson, J. Odobez","doi":"10.1145/2818346.2820759","DOIUrl":"https://doi.org/10.1145/2818346.2820759","url":null,"abstract":"Estimating a silent participant's degree of engagement and his role within a group discussion can be challenging, as there are no speech related cues available at the given time. Having this information available, however, can provide important insights into the dynamics of the group as a whole. In this paper, we study the classification of listeners into several categories (attentive listener, side participant and bystander). We devised a thin-sliced perception test where subjects were asked to assess listener roles and engagement levels in 15-second video-clips taken from a corpus of group interviews. Results show that humans are usually able to assess silent participant roles. Using the annotation to identify from a set of multimodal low-level features, such as past speaking activity, backchannels (both visual and verbal), as well as gaze patterns, we could identify the features which are able to distinguish between different listener categories. Moreover, the results show that many of the audio-visual effects observed on listeners in dyadic interactions, also hold for multi-party interactions. A preliminary classifier achieves an accuracy of 64 %.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86895940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 23
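The study above maps low-level audio-visual descriptors of a silent participant to one of three listener categories. A schematic version of such a feature-vector classifier on synthetic data, with hypothetical feature names (this is not the study's feature set, annotations, or reported 64% result):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
# Hypothetical per-slice features: past speaking time, visual backchannels,
# verbal backchannels, proportion of gaze directed at the current speaker.
X = rng.random((300, 4))
y = rng.integers(0, 3, size=300)  # 0=attentive listener, 1=side participant, 2=bystander

clf = RandomForestClassifier(n_estimators=200, random_state=0)
# With random labels this should hover near chance (~1/3); real annotations would not.
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```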
Social Touch Gesture Recognition using Random Forest and Boosting on Distinct Feature Sets
Y. F. A. Gaus, Temitayo A. Olugbade, Asim Jan, R. Qin, Jingxin Liu, Fan Zhang, H. Meng, N. Bianchi-Berthouze
{"title":"Social Touch Gesture Recognition using Random Forest and Boosting on Distinct Feature Sets","authors":"Y. F. A. Gaus, Temitayo A. Olugbade, Asim Jan, R. Qin, Jingxin Liu, Fan Zhang, H. Meng, N. Bianchi-Berthouze","doi":"10.1145/2818346.2830599","DOIUrl":"https://doi.org/10.1145/2818346.2830599","url":null,"abstract":"Touch is a primary nonverbal communication channel used to communicate emotions or other social messages. Despite its importance, this channel is still very little explored in the affective computing field, as much more focus has been placed on visual and aural channels. In this paper, we investigate the possibility to automatically discriminate between different social touch types. We propose five distinct feature sets for describing touch behaviours captured by a grid of pressure sensors. These features are then combined together by using the Random Forest and Boosting methods for categorizing the touch gesture type. The proposed methods were evaluated on both the HAART (7 gesture types over different surfaces) and the CoST (14 gesture types over the same surface) datasets made available by the Social Touch Gesture Challenge 2015. Well above chance level performances were achieved with a 67% accuracy for the HAART and 59% for the CoST testing datasets respectively.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"93 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83207166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 27
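The paper above computes several feature sets from pressure-sensor sequences and classifies them with Random Forest and boosting. A minimal sketch of that style of pipeline on synthetic pressure frames (the feature statistics and data below are invented; the CoST-like shapes are only for illustration, not the paper's feature sets):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
# Synthetic stand-in: 8x8 pressure frames over 50 time steps per gesture sample.
n_samples, frames, grid = 400, 50, (8, 8)
sequences = rng.random((n_samples, frames, *grid))
labels = rng.integers(0, 14, size=n_samples)  # 14 gesture classes, as in CoST

def describe(seq: np.ndarray) -> np.ndarray:
    """Global statistics of one pressure sequence (one possible hand-crafted feature set)."""
    return np.array([seq.mean(), seq.max(), seq.std(),
                     seq.sum(axis=(1, 2)).argmax(),   # time index of peak total pressure
                     (seq > 0.9).mean()])             # fraction of high-pressure cells

X = np.vstack([describe(s) for s in sequences])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, random_state=0)

for clf in (RandomForestClassifier(n_estimators=300, random_state=0),
            GradientBoostingClassifier(random_state=0)):
    print(type(clf).__name__, clf.fit(X_tr, y_tr).score(X_te, y_te))
```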
Session details: Doctoral Consortium
C. Busso
{"title":"Session details: Doctoral Consortium","authors":"C. Busso","doi":"10.1145/3252454","DOIUrl":"https://doi.org/10.1145/3252454","url":null,"abstract":"","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89199725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Multimodal System for Public Speaking with Real Time Feedback
F. Dermody, Alistair Sutherland
{"title":"A Multimodal System for Public Speaking with Real Time Feedback","authors":"F. Dermody, Alistair Sutherland","doi":"10.1145/2818346.2823295","DOIUrl":"https://doi.org/10.1145/2818346.2823295","url":null,"abstract":"We have developed a multimodal prototype for public speaking with real time feedback using the Microsoft Kinect. Effective speaking involves use of gesture, facial expression, posture, voice as well as the spoken word. These modalities combine to give the appearance of self-confidence in the speaker. This initial prototype detects body pose, facial expressions and voice. Visual and text feedback is displayed in real time to the user using a video panel, icon panel and text feedback panel. The user can also set and view elapsed time during their speaking performance. Real time feedback is displayed on gaze direction, body pose and gesture, vocal tonality, vocal dysfluencies and speaking rate.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"38 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85853786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 12
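Real-time feedback of the kind described above can be thought of as a set of per-interval rules over measured speaking behaviour. A toy, rule-based illustration (thresholds, field names, and messages are invented, not the prototype's actual logic or the Kinect API):

```python
from dataclasses import dataclass

@dataclass
class SpeakerState:
    words_per_minute: float
    gaze_at_audience: float   # fraction of the last second spent facing the audience, 0..1
    arms_crossed: bool

def feedback(state: SpeakerState) -> list[str]:
    """Return the feedback messages to display for the current interval."""
    messages = []
    if state.words_per_minute > 180:
        messages.append("Slow down")
    elif state.words_per_minute < 100:
        messages.append("Speed up a little")
    if state.gaze_at_audience < 0.5:
        messages.append("Look at the audience")
    if state.arms_crossed:
        messages.append("Open your posture")
    return messages

print(feedback(SpeakerState(words_per_minute=190, gaze_at_audience=0.3, arms_crossed=True)))
```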
MPHA: A Personal Hearing Doctor Based on Mobile Devices
Yu-Hao Wu, Jia Jia, Wai-Kim Leung, Yejun Liu, Lianhong Cai
{"title":"MPHA: A Personal Hearing Doctor Based on Mobile Devices","authors":"Yu-Hao Wu, Jia Jia, Wai-Kim Leung, Yejun Liu, Lianhong Cai","doi":"10.1145/2818346.2820753","DOIUrl":"https://doi.org/10.1145/2818346.2820753","url":null,"abstract":"As more and more people inquire to know their hearing level condition, audiometry is becoming increasingly important. However, traditional audiometric method requires the involvement of audiometers, which are very expensive and time consuming. In this paper, we present mobile personal hearing assessment (MPHA), a novel interactive mode for testing hearing level based on mobile devices. MPHA, 1) provides a general method to calibrate sound intensity for mobile devices to guarantee the reliability and validity of the audiometry system; 2) designs an audiometric correction algorithm for the real noisy audiometric environment. The experimental results show that MPHA is reliable and valid compared with conventional audiometric assessment.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"31 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79184101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
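Hearing-level testing of the kind MPHA performs typically plays calibrated tones at varying levels and estimates the softest level the listener reliably reports hearing. The sketch below simulates a simple down-10/up-5 staircase to show the threshold-estimation logic only; it is not MPHA's calibration or correction algorithm, and the simulated listener stands in for real user responses:

```python
import numpy as np

def simulated_listener(level_db: float, true_threshold_db: float = 25.0) -> bool:
    """Stand-in for a user response: the tone is heard if it is at or above threshold."""
    return level_db >= true_threshold_db

def staircase(start_db: float = 60.0, step_down: float = 10.0, step_up: float = 5.0,
              reversals_needed: int = 4) -> float:
    """Down-10 / up-5 procedure; returns the mean level at response reversals."""
    level, last_heard, reversal_levels = start_db, None, []
    while len(reversal_levels) < reversals_needed:
        heard = simulated_listener(level)
        if last_heard is not None and heard != last_heard:
            reversal_levels.append(level)      # record the level where the response flipped
        last_heard = heard
        level += -step_down if heard else step_up
    return float(np.mean(reversal_levels))

print("estimated threshold (dB HL):", staircase())
```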