Companion Publication of the 2020 International Conference on Multimodal Interaction: Latest Articles

Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation
Anna Deichler, Shivam Mehta, Simon Alexanderson, Jonas Beskow
DOI: 10.1145/3577190.3616117 (https://doi.org/10.1145/3577190.3616117)
Published: 2023-10-09
Abstract: This paper describes a system developed for the GENEA (Generation and Evaluation of Non-verbal Behaviour for Embodied Agents) Challenge 2023. Our solution builds on an existing diffusion-based motion synthesis model. We propose a contrastive speech and motion pretraining (CSMP) module, which learns a joint embedding for speech and gesture with the aim of capturing the semantic coupling between these modalities. The output of the CSMP module is used as a conditioning signal in the diffusion-based gesture synthesis model in order to achieve semantically aware co-speech gesture generation. Our entry achieved the highest human-likeness and the highest speech-appropriateness ratings among the submitted entries, indicating that our system is a promising approach for generating human-like co-speech gestures that carry semantic meaning in embodied agents.
Citations: 3

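As a rough illustration of the joint-embedding idea behind a module like CSMP, the sketch below trains two stand-in encoders with a CLIP-style symmetric contrastive loss so that paired speech and motion clips map to nearby points in a shared space; the resulting speech embedding could then serve as a conditioning signal for a diffusion synthesizer. The feature dimensions, encoder architectures, and temperature initialization are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveSpeechMotion(nn.Module):
    """CLIP-style joint embedding for paired speech and motion clips.

    Illustrative sketch only: the encoders are stand-in MLPs over pre-pooled
    clip-level features, not the CSMP architecture from the paper.
    """
    def __init__(self, speech_dim=768, motion_dim=165, embed_dim=256):
        super().__init__()
        self.speech_enc = nn.Sequential(
            nn.Linear(speech_dim, 512), nn.GELU(), nn.Linear(512, embed_dim))
        self.motion_enc = nn.Sequential(
            nn.Linear(motion_dim, 512), nn.GELU(), nn.Linear(512, embed_dim))
        # Learnable temperature, initialised to roughly log(1/0.07) as in CLIP.
        self.logit_scale = nn.Parameter(torch.tensor(2.659))

    def forward(self, speech_feats, motion_feats):
        # L2-normalise so the dot product becomes a cosine similarity.
        s = F.normalize(self.speech_enc(speech_feats), dim=-1)
        m = F.normalize(self.motion_enc(motion_feats), dim=-1)
        logits = self.logit_scale.exp() * s @ m.t()       # (B, B) similarity matrix
        targets = torch.arange(len(s), device=s.device)   # matched pairs on the diagonal
        # Symmetric InfoNCE: speech-to-motion and motion-to-speech.
        loss = 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))
        return loss, s  # `s` is the embedding that could condition a diffusion model
```
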
4th International Workshop on Multimodal Affect and Aesthetic Experience
Michal Muszynski, Theodoros Kostoulas, Leimin Tian, Edgar Roman-Rangel, Theodora Chaspari, Panos Amelidis
DOI: 10.1145/3577190.3616886 (https://doi.org/10.1145/3577190.3616886)
Published: 2023-10-09
Abstract: "Aesthetic experience" corresponds to the inner state of a person exposed to the form and content of artistic objects. Quantifying and interpreting the aesthetic experience of people in various contexts contributes towards a) creating context, and b) better understanding people's affective reactions to aesthetic stimuli. Focusing on different types of artistic content, such as movies, music, literature, urban art, ancient artwork, and modern interactive technology, the 4th International Workshop on Multimodal Affect and Aesthetic Experience (MAAE) aims to enhance interdisciplinary collaboration among researchers from affective computing, aesthetics, human-robot/computer interaction, digital archaeology and art, culture, ethics, and addictive games.
Citations: 0

Robot Duck Debugging: Can Attentive Listening Improve Problem Solving?
Maria Teresa Parreira, Sarah Gillet, Iolanda Leite
DOI: 10.1145/3577190.3614160 (https://doi.org/10.1145/3577190.3614160)
Published: 2023-10-09
Abstract: While thinking aloud has been reported to positively affect problem-solving, the effects of the presence of an embodied entity (e.g., a social robot) to whom words can be directed remain mostly unexplored. In this work, we investigated the role of a robot in a "rubber duck debugging" setting, by analyzing how a robot's listening behaviors could support a thinking-aloud problem-solving session. Participants completed two different tasks while speaking their thoughts aloud to either a robot or an inanimate object (a giant rubber duck). We implemented and tested two types of listener behavior in the robot: a rule-based heuristic and a deep-learning-based model. In a between-subject user study with 101 participants, we evaluated how the presence of a robot affected users' engagement in thinking aloud, behavior during the task, and self-reported user experience. In addition, we explored the impact of the two robot listening behaviors on those measures. In contrast to prior work, our results indicate that neither the rule-based heuristic nor the deep-learning robot condition improved performance or perception of the task compared to an inanimate object. We discuss potential explanations and shed light on the feasibility of designing social robots as assistive tools in thinking-aloud problem-solving tasks.
Citations: 1

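For readers curious what a rule-based listening behavior can look like in such a setup, here is a toy sketch that triggers a backchannel (a nod or a short vocalisation) after a sufficiently long pause with falling pitch. The thresholds, behaviors, and function name are hypothetical and are not taken from the paper's implementation.

```python
import random

def backchannel_decision(pause_s, pitch_slope, min_pause_s=0.7, p_nod=0.6):
    """Toy rule-based listener: decide how to react after a speaker pause.

    Hypothetical sketch of a rule-based listening heuristic; the thresholds
    and behaviors are invented for illustration.
    """
    if pause_s < min_pause_s:
        return "none"                      # speaker likely still talking, stay quiet
    if pitch_slope < 0:                    # falling pitch often marks a clause boundary
        return "nod" if random.random() < p_nod else "mm-hmm"
    return "gaze"                          # rising pitch: keep attentive gaze only

print(backchannel_decision(pause_s=0.9, pitch_slope=-0.3))
```
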
Implicit Search Intent Recognition using EEG and Eye Tracking: Novel Dataset and Cross-User Prediction
Mansi Sharma, Shuang Chen, Philipp Müller, Maurice Rekrut, Antonio Krüger
DOI: 10.1145/3577190.3614166 (https://doi.org/10.1145/3577190.3614166)
Published: 2023-10-09
Abstract: For machines to effectively assist humans in challenging visual search tasks, they must differentiate whether a human is simply glancing into a scene (navigational intent) or searching for a target object (informational intent). Previous research proposed combining electroencephalography (EEG) and eye-tracking measurements to recognize such search intents implicitly, i.e., without explicit user input. However, the applicability of these approaches to real-world scenarios suffers from two key limitations. First, previous work used fixed search times in the informational intent condition, in stark contrast to visual search, which naturally terminates when the target is found. Second, methods incorporating EEG measurements addressed prediction scenarios that require ground-truth training data from the target user, which is impractical in many use cases. We address these limitations by making the first publicly available EEG and eye-tracking dataset for navigational vs. informational intent recognition in which the user determines search times. We present the first method for cross-user prediction of search intents from EEG and eye-tracking recordings, reaching a leave-one-user-out accuracy comparable to within-user prediction accuracy while offering much greater flexibility.
Citations: 0

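Cross-user prediction here means leave-one-user-out evaluation: the classifier is trained on every participant except one and tested on the held-out participant. A minimal sketch with scikit-learn's LeaveOneGroupOut is shown below; the synthetic features, group layout, and SVM classifier are placeholders rather than the paper's pipeline.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: X holds fused EEG + eye-tracking features per trial,
# y the intent label (0 = navigational, 1 = informational), groups the user ID.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 64))
y = rng.integers(0, 2, size=300)
groups = np.repeat(np.arange(15), 20)   # 15 hypothetical users, 20 trials each

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    # Train on all users except one, test on the held-out user.
    clf.fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))

print(f"Cross-user accuracy: {np.mean(scores):.2f} +/- {np.std(scores):.2f}")
```
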
TongueTap: Multimodal Tongue Gesture Recognition with Head-Worn Devices
Tan Gemicioglu, R. Michael Winters, Yu-Te Wang, Thomas M. Gable, Ivan J. Tashev
DOI: 10.1145/3577190.3614120 (https://doi.org/10.1145/3577190.3614120)
Published: 2023-10-09
Abstract: Mouth-based interfaces are a promising new approach enabling silent, hands-free and eyes-free interaction with wearable devices. However, interfaces sensing mouth movements are traditionally custom-designed and placed near or within the mouth. TongueTap synchronizes multimodal EEG, PPG, IMU, eye tracking and head tracking data from two commercial headsets to facilitate tongue gesture recognition using only off-the-shelf devices on the upper face. We classified eight closed-mouth tongue gestures with 94% accuracy, offering an invisible and inaudible method for discreet control of head-worn devices. Moreover, we found that the IMU alone differentiates eight gestures with 80% accuracy and a subset of four gestures with 92% accuracy. We built a dataset of 48,000 gesture trials across 16 participants, allowing TongueTap to perform user-independent classification. Our findings suggest tongue gestures can be a viable interaction technique for VR/AR headsets and earables without requiring novel hardware.
Citations: 0

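Fusing EEG, PPG, IMU and gaze from two headsets requires putting all streams onto one clock before classification. The sketch below aligns arbitrarily sampled streams onto a uniform grid by linear interpolation; it is a generic illustration under assumed sampling rates and stream names, not the TongueTap synchronization code.

```python
import numpy as np

def resample_to_common_clock(streams, rate_hz=100.0):
    """Align multimodal streams (e.g. EEG, PPG, IMU) onto one shared clock.

    `streams` maps a name to (timestamps_s, values) with values of shape (T, C).
    Illustrative sketch of timestamp-based alignment: each channel is linearly
    interpolated onto a uniform grid covering the overlapping time span.
    """
    start = max(ts[0] for ts, _ in streams.values())
    stop = min(ts[-1] for ts, _ in streams.values())
    grid = np.arange(start, stop, 1.0 / rate_hz)
    aligned = {}
    for name, (ts, vals) in streams.items():
        vals = np.atleast_2d(vals.T).T  # ensure shape (T, C)
        aligned[name] = np.column_stack(
            [np.interp(grid, ts, vals[:, c]) for c in range(vals.shape[1])])
    return grid, aligned

# Hypothetical usage with two devices running on slightly different clocks:
t_imu = np.linspace(0, 10, 1000)
t_eeg = np.linspace(0.05, 10.02, 2500)
streams = {
    "imu": (t_imu, np.random.randn(1000, 6)),   # 6-axis IMU
    "eeg": (t_eeg, np.random.randn(2500, 4)),   # 4 EEG channels
}
grid, aligned = resample_to_common_clock(streams, rate_hz=100.0)
print({name: data.shape for name, data in aligned.items()})
```
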
Ether-Mark: An Off-Screen Marking Menu For Mobile Devices
Hanae Rateau, Yosra Rekik, Edward Lank
DOI: 10.1145/3577190.3614150 (https://doi.org/10.1145/3577190.3614150)
Published: 2023-10-09
Abstract: Given the computing power of mobile devices, porting feature-rich applications to these devices is increasingly feasible. However, feature-rich applications include large command sets, and providing access to these commands through screen-based widgets results in issues of occlusion and layering. To address this issue, we introduce Ether-Mark, a hierarchical, gesture-based, marking-menu-inspired around-device menu for mobile devices that enables both on- and near-device interaction. We investigate the design of such menus and their learnability through three experiments. We first design and contrast three variants of Ether-Mark, yielding a zigzag menu design. We then refine input accuracy via a deformation model of the menu. Finally, we evaluate the learnability of the menus and the accuracy of the deformation model, revealing an accuracy rate of up to 98.28%, and compare in-air Ether-Mark with conventional marking menus. Our results argue for Ether-Mark as a promising and effective mechanism for leveraging proximal around-device space.
Citations: 0

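At its core, a marking menu maps the direction of a stroke to one of N menu sectors. The sketch below shows that basic mapping for an eight-item level; it is a textbook illustration, not the Ether-Mark zigzag recognizer or its deformation model.

```python
import math

def stroke_to_item(dx, dy, n_items=8):
    """Map a marking-menu stroke direction to one of n_items sectors.

    Illustrative sketch of classic marking-menu selection: sector 0 is east,
    and indices increase counter-clockwise. Not the Ether-Mark recognizer.
    """
    angle = math.degrees(math.atan2(-dy, dx)) % 360.0   # screen y grows downward
    sector = 360.0 / n_items
    return int(((angle + sector / 2) % 360.0) // sector)

print(stroke_to_item(10, 0))    # 0: eastward stroke
print(stroke_to_item(0, -10))   # 2: northward stroke
```
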
4th Workshop on Social Affective Multimodal Interaction for Health (SAMIH)
Hiroki Tanaka, Satoshi Nakamura, Jean-Claude Martin, Catherine Pelachaud
DOI: 10.1145/3577190.3616878 (https://doi.org/10.1145/3577190.3616878)
Published: 2023-10-09
Abstract: This workshop discusses how interactive, multimodal technology, such as virtual agents, can measure and train social-affective interactions. Sensing technology now enables analyzing users' behaviors and physiological signals. Various signal processing and machine learning methods can be used for prediction tasks. Such social signal processing and tools can be applied to measure and reduce social stress in everyday situations, including public speaking at schools and workplaces.
Citations: 0

Recording multimodal pair-programming dialogue for reference resolution by conversational agents
Cecilia Domingo
DOI: 10.1145/3577190.3614231 (https://doi.org/10.1145/3577190.3614231)
Published: 2023-10-09
Abstract: Pair programming is a collaborative technique which has proven highly beneficial in terms of the code produced and the learning gains for programmers. With recent advances in Programming Language Processing (PLP), numerous tools have been created that assist programmers in non-collaborative settings (i.e., where the technology provides users with a solution, instead of discussing the problem to develop a solution together). How can we develop AI that can assist in pair programming, a collaborative setting? To tackle this task, we begin by gathering multimodal dialogue data which can be used to train systems in a basic subtask of dialogue understanding: multimodal reference resolution, i.e., understanding which parts of a program are being mentioned by users through speech or by using the mouse and keyboard.
Citations: 0

Modeling Social Cognition and its Neurologic Deficits with Artificial Neural Networks
Laurent P. Mertens
DOI: 10.1145/3577190.3614232 (https://doi.org/10.1145/3577190.3614232)
Published: 2023-10-09
Abstract: Artificial Neural Networks (ANNs) are computer models loosely inspired by the functioning of the human brain. They are the state-of-the-art method for tackling a variety of Artificial Intelligence (AI) problems, and an increasingly popular tool in neuroscientific studies. However, both domains pursue different goals: in AI, performance is key and brain resemblance is incidental, while in neuroscience the aim is chiefly to better understand the brain. This PhD is situated at the intersection of both disciplines. Its goal is to develop ANNs that model social cognition in neurotypical individuals, and that can be altered in a controlled way to exhibit behavior consistent with individuals with one of two clinical conditions, Autism Spectrum Disorder and Frontotemporal Dementia.
Citations: 0

The Role of Audiovisual Feedback Delays and Bimodal Congruency for Visuomotor Performance in Human-Machine Interaction
Annika Dix, Clarissa Sabrina Arlinghaus, A. Marie Harkin, Sebastian Pannasch
DOI: 10.1145/3577190.3614111 (https://doi.org/10.1145/3577190.3614111)
Published: 2023-10-09
Abstract: Despite incredible technological progress in the last decades, latency is still an issue for today's technologies and their applications. To better understand how latency and the resulting feedback delays affect the interaction between humans and cyber-physical systems (CPS), the present study examines separate and joint effects of visual and auditory feedback delays on performance and motor control strategy in a complex visuomotor task. Thirty-six participants played the Wire Loop Game, a fine motor skill task, while going through four different delay conditions: no delay, visual only, auditory only, and audiovisual (length: 200 ms). Participants' speed and accuracy in completing the task and their movement kinematics were assessed. Visual feedback delays slowed down movement execution and impaired precision compared to a condition without feedback delays. In contrast, delayed auditory feedback improved precision; descriptively, this effect mainly appeared when congruent visual and auditory feedback delays were provided. We discuss the role of temporal congruency of audiovisual information as well as potential compensatory mechanisms that can inform the design of multisensory feedback in human-CPS interaction faced with latency.
Citations: 0