Companion Publication of the 2020 International Conference on Multimodal Interaction: Latest Publications

Expanding the Role of Affective Phenomena in Multimodal Interaction Research
Leena Mathur, Maja Mataric, Louis-Philippe Morency
DOI: 10.1145/3577190.3614171 (https://doi.org/10.1145/3577190.3614171) | Published: 2023-10-09
Abstract: In recent decades, the field of affective computing has made substantial progress in advancing the ability of AI systems to recognize and express affective phenomena, such as affect and emotions, during human-human and human-machine interactions. This paper describes our examination of research at the intersection of multimodal interaction and affective computing, with the objective of observing trends and identifying understudied areas. We examined over 16,000 papers from selected conferences in multimodal interaction, affective computing, and natural language processing: ACM International Conference on Multimodal Interaction, AAAC International Conference on Affective Computing and Intelligent Interaction, Annual Meeting of the Association for Computational Linguistics, and Conference on Empirical Methods in Natural Language Processing. We identified 910 affect-related papers and present our analysis of the role of affective phenomena in these papers. We find that this body of research has primarily focused on enabling machines to recognize or express affect and emotion; there has been limited research on how affect and emotion predictions might, in turn, be used by AI systems to enhance machine understanding of human social behaviors and cognitive states. Based on our analysis, we discuss directions to expand the role of affective phenomena in multimodal interaction research.
Citations: 0
ACE: how Artificial Character Embodiment shapes user behaviour in multi-modal interaction
Eleonora Ceccaldi, Beatrice Biancardi, Sara Falcone, Silvia Ferrando, Geoffrey Gorisse, Thomas Janssoone, Anna Martin Coesel, Pierre Raimbaud
DOI: 10.1145/3577190.3617134 (https://doi.org/10.1145/3577190.3617134) | Published: 2023-10-09
Abstract: The ACE - how Artificial Character Embodiment shapes user behavior in multi-modal interactions - workshop aims to bring together researchers, practitioners and experts on the topic of embodiment, to analyze and foster discussion on its effects on user behavior in multi-modal interaction. ACE is aimed at stimulating multidisciplinary discussions on the topic, sharing recent progress, and providing participants with a forum to debate current and future challenges. The workshop includes contributions from computational, neuroscientific and psychological perspectives, as well as technical applications.
Citations: 0
Large language models in textual analysis for gesture selection
Laura Birka Hensel, Nutchanon Yongsatianchot, Parisa Torshizi, Elena Minucci, Stacy Marsella
DOI: 10.1145/3577190.3614158 (https://doi.org/10.1145/3577190.3614158) | Published: 2023-10-09
Abstract: Gestures perform a variety of communicative functions that powerfully influence human face-to-face interaction. How this communicative function is achieved varies greatly between individuals and depends on the role of the speaker and the context of the interaction. Approaches to automatic gesture generation vary not only in the degree to which they rely on data-driven techniques but also in the degree to which they can produce context- and speaker-specific gestures. However, these approaches face two major challenges: the first is obtaining sufficient training data that is appropriate for the context and the goal of the application; the second is related to designer control to realize their specific intent for the application. Here, we approach these challenges by using large language models (LLMs) to show that these powerful models of large amounts of data can be adapted for gesture analysis and generation. Specifically, we used ChatGPT as a tool for suggesting context-specific gestures that can realize designer intent based on minimal prompts. We also find that ChatGPT can suggest novel yet appropriate gestures not present in the minimal training data. The use of LLMs is a promising avenue for gesture generation that reduces the need for laborious annotations and has the potential to flexibly and quickly adapt to different designer intents.
Citations: 0
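To make the prompting workflow concrete, below is a minimal sketch of how an LLM might be queried for context-specific gesture suggestions in the spirit of this abstract. It is not the authors' setup: the model name, prompt wording, and use of the OpenAI Python SDK (v1-style client, API key in OPENAI_API_KEY) are illustrative assumptions.

```python
# Illustrative sketch, not the paper's code: asking an LLM to propose
# co-speech gestures for an utterance given a speaker role.
# Assumes `pip install openai` (v1 SDK) and OPENAI_API_KEY in the environment;
# the model name and prompt text are hypothetical choices.
from openai import OpenAI

client = OpenAI()

PROMPT_TEMPLATE = (
    "You are assisting a character animator. The speaker is {role}. "
    "For the utterance below, suggest up to {n} co-speech gestures. "
    "For each, give the word or phrase it should align with, the gesture type "
    "(beat, iconic, metaphoric, or deictic), and a one-line description.\n\n"
    'Utterance: "{utterance}"'
)

def suggest_gestures(utterance: str, role: str = "a museum guide", n: int = 3) -> str:
    """Return the LLM's gesture suggestions as plain text."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",   # hypothetical model choice
        temperature=0.7,       # allow some variety in the suggestions
        messages=[{
            "role": "user",
            "content": PROMPT_TEMPLATE.format(role=role, n=n, utterance=utterance),
        }],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(suggest_gestures("This fossil is over sixty million years old."))
```

In practice a designer would iterate on the prompt (role, context, allowed gesture types) to steer the suggestions toward their intent, which is the kind of lightweight control the abstract highlights.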
Using Explainability for Bias Mitigation: A Case Study for Fair Recruitment Assessment
Gizem Sogancioglu, Heysem Kaya, Albert Ali Salah
DOI: 10.1145/3577190.3614170 (https://doi.org/10.1145/3577190.3614170) | Published: 2023-10-09
Abstract: In this study, we propose a bias-mitigation algorithm, dubbed ProxyMute, that uses an explainability method to detect proxy features of a given sensitive attribute (e.g., gender) and reduces their effects on decisions by disabling them during prediction time. We evaluate our method for a job recruitment use-case on two different multimodal datasets, namely FairCVdb and ChaLearn LAP-FI. The exhaustive set of experiments shows that information regarding the proxy features provided by explainability methods is beneficial and can be successfully used for the problem of bias mitigation. Furthermore, when combined with a target label normalization method, the proposed approach shows good performance by yielding one of the fairest results without deteriorating the performance significantly compared to previous works on both experimental datasets. The scripts to reproduce the results are available at: https://github.com/gizemsogancioglu/expl-bias-mitigation.
Citations: 0
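The following is a minimal sketch of the general proxy-muting idea the abstract describes, not the authors' ProxyMute implementation (that is in their linked repository). Here the "explainability method" is stood in for by permutation importance, and "muting" a feature means overwriting it with its training mean at prediction time; both choices, and the synthetic data, are illustrative assumptions.

```python
# Sketch under stated assumptions: find features that act as proxies for a
# sensitive attribute, then neutralize them before the main model predicts.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.inspection import permutation_importance

def find_proxy_features(X_train, sensitive_attr, top_k=2, seed=0):
    """Rank features by how strongly they predict the sensitive attribute."""
    clf = RandomForestClassifier(random_state=seed).fit(X_train, sensitive_attr)
    imp = permutation_importance(clf, X_train, sensitive_attr,
                                 n_repeats=10, random_state=seed)
    return np.argsort(imp.importances_mean)[::-1][:top_k]

def predict_with_muted_proxies(model, X, proxy_idx, train_means):
    """Disable proxy features by overwriting them with their training means."""
    X_muted = X.copy()
    X_muted[:, proxy_idx] = train_means[proxy_idx]
    return model.predict(X_muted)

# Toy usage with synthetic data: feature 3 correlates with the sensitive attribute.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
gender = (X[:, 3] + 0.1 * rng.normal(size=200) > 0).astype(int)
y = X[:, 0] + 0.5 * X[:, 3] + 0.1 * rng.normal(size=200)

model = RandomForestRegressor(random_state=0).fit(X, y)
proxies = find_proxy_features(X, gender)
fair_preds = predict_with_muted_proxies(model, X, proxies, X.mean(axis=0))
```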
Automated Assessment of Pain (AAP)
Zakia Hammal, Steffen Walter, Nadia Berthouze
DOI: 10.1145/3577190.3617147 (https://doi.org/10.1145/3577190.3617147) | Published: 2023-10-09
Abstract: Pain communication varies, with some patients being highly expressive regarding their pain and others exhibiting stoic forbearance and minimal verbal account of discomfort. Considerable progress has been made in defining behavioral indices of pain [1-3]. An abundant literature shows that a limited subset of facial movements, in several non-human species, encodes pain intensity across the lifespan [2]. To advance reliable pain monitoring, automated assessment of pain is emerging as a powerful means to realize that goal. Though progress has been made, this field remains in its infancy. The workshop aims to promote current research and support growth of interdisciplinary collaborations to advance this groundbreaking research.
Citations: 0
Breathing New Life into COPD Assessment: Multisensory Home-monitoring for Predicting Severity
Zixuan Xiao, Michal Muszynski, Ričards Marcinkevičs, Lukas Zimmerli, Adam Daniel Ivankay, Dario Kohlbrenner, Manuel Kuhn, Yves Nordmann, Ulrich Muehlner, Christian Clarenbach, Julia E. Vogt, Thomas Brunschwiler
DOI: 10.1145/3577190.3614109 (https://doi.org/10.1145/3577190.3614109) | Published: 2023-10-09
Abstract: Chronic obstructive pulmonary disease (COPD) is a significant public health issue, affecting more than 100 million people worldwide. Remote patient monitoring has shown great promise in the efficient management of patients with chronic diseases. This work presents the analysis of the data from a monitoring system developed to track COPD symptoms alongside patients’ self-reports. In particular, we investigate the assessment of COPD severity using multisensory home-monitoring device data acquired from 30 patients over a period of three months. We describe a comprehensive data pre-processing and feature engineering pipeline for multimodal data from the remote home-monitoring of COPD patients. We develop and validate predictive models forecasting i) the absolute and ii) differenced COPD Assessment Test (CAT) scores based on the multisensory data. The best obtained models achieve Pearson’s correlation coefficients of 0.93 and 0.37 for absolute and differenced CAT scores, respectively. In addition, we investigate the importance of individual sensor modalities for predicting CAT scores using group sparse regularization techniques. Our results suggest that feature groups indicative of the patient’s general condition, such as static medical and physiological information, date, spirometer, and air quality, are crucial for predicting the absolute CAT score. For predicting changes in CAT scores, sleep and physical activity features are most important, alongside the previous CAT score value. Our analysis demonstrates the potential of remote patient monitoring for COPD management and investigates which sensor modalities are most indicative of COPD severity as assessed by the CAT score. Our findings contribute to the development of effective and data-driven COPD management strategies.
Citations: 0
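The abstract's use of group sparse regularization to rank sensor modalities can be illustrated with a generic group-lasso fit, where each feature group stands in for one modality and the fitted group norms indicate modality importance. This is a textbook proximal-gradient sketch, not the authors' pipeline; the group layout, step size, and regularization strength are arbitrary placeholders.

```python
# Generic group-lasso sketch: minimize (1/2n)||Xw - y||^2 + lam * sum_g ||w_g||_2
# via proximal gradient descent, then read off per-group (per-modality) norms.
import numpy as np

def group_lasso(X, y, groups, lam=0.2, lr=0.01, n_iter=2000):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n          # gradient of the squared-error term
        w = w - lr * grad
        for idx in groups:                    # proximal (group soft-threshold) step
            norm = np.linalg.norm(w[idx])
            shrink = max(0.0, 1.0 - lr * lam / norm) if norm > 0 else 0.0
            w[idx] = shrink * w[idx]
    return w

# Toy example: three "modalities" of 4 features each; only the first is informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 12))
y = X[:, :4] @ np.array([1.0, -0.5, 0.8, 0.3]) + 0.1 * rng.normal(size=300)
groups = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]

w = group_lasso(X, y, groups)
modality_importance = [np.linalg.norm(w[idx]) for idx in groups]
print(modality_importance)   # larger group norms indicate more important modalities
```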
Bridging Multimedia Modalities: Enhanced Multimodal AI Understanding and Intelligent Agents
Sushant Gautam
DOI: 10.1145/3577190.3614225 (https://doi.org/10.1145/3577190.3614225) | Published: 2023-10-09
Abstract: With the increasing availability of multimodal data, especially in the sports and medical domains, there is growing interest in developing Artificial Intelligence (AI) models capable of comprehending the world in a more holistic manner. Nevertheless, various challenges exist in multimodal understanding, including the integration of multiple modalities and the resolution of semantic gaps between them. The proposed research aims to leverage multiple input modalities for the multimodal understanding of AI models, enhancing their reasoning, generation, and intelligent behavior. The research objectives focus on developing novel methods for multimodal AI and integrating them into conversational agents with optimizations for domain-specific requirements. The research methodology encompasses literature review, data curation, model development and implementation, evaluation and performance analysis, domain-specific applications, and documentation and reporting. Ethical considerations will be thoroughly addressed, and a comprehensive research plan is outlined to provide guidance. The research contributes to the field of multimodal AI understanding and the advancement of sophisticated AI systems by experimenting with multimodal data to enhance the performance of state-of-the-art neural networks.
Citations: 0
Conversational Grounding in Multimodal Dialog Systems
Biswesh Mohapatra
DOI: 10.1145/3577190.3614226 (https://doi.org/10.1145/3577190.3614226) | Published: 2023-10-09
Abstract: The process of “conversational grounding” is an interactive process that has been studied extensively in cognitive science, whereby participants in a conversation check to make sure their interlocutors understand what is being referred to. This interactive process uses multiple modes of communication to establish the information between the participants. This could include information provided through eye-gaze, head movements, intonation in speech, along with the content of the speech. While the process is essential to successful communication between humans and between humans and machines, work needs to be done on testing and building the capabilities of current dialogue systems in managing conversational grounding, especially in multimodal media of communication. Recent work, such as Benotti and Blackburn [3], has shown the importance of conversational grounding in dialog systems and how current systems fail at it, which is essential for the advancement of Embodied Conversational Agents and Social Robots. Thus my Ph.D. project aims to test, understand and improve the functioning of current dialog models with respect to conversational grounding.
Citations: 0
Neural Mixed Effects for Nonlinear Personalized Predictions
Torsten Wörtwein, Nicholas B. Allen, Lisa B. Sheeber, Randy P. Auerbach, Jeffrey F. Cohn, Louis-Philippe Morency
DOI: 10.1145/3577190.3614115 (https://doi.org/10.1145/3577190.3614115) | Published: 2023-10-09
Abstract: Personalized prediction is a machine learning approach that predicts a person’s future observations based on their past labeled observations and is typically used for sequential tasks, e.g., to predict daily mood ratings. When making personalized predictions, a model can combine two types of trends: (a) trends shared across people, i.e., person-generic trends, such as being happier on weekends, and (b) unique trends for each person, i.e., person-specific trends, such as a stressful weekly meeting. Mixed effect models are popular statistical models to study both trends by combining person-generic and person-specific parameters. Though linear mixed effect models are gaining popularity in machine learning by integrating them with neural networks, these integrations are currently limited to linear person-specific parameters, ruling out nonlinear person-specific trends. In this paper, we propose Neural Mixed Effect (NME) models to optimize nonlinear person-specific parameters anywhere in a neural network in a scalable manner. NME combines the efficiency of neural network optimization with nonlinear mixed effects modeling. Empirically, we observe that NME improves performance across six unimodal and multimodal datasets, including a smartphone dataset to predict daily mood and a mother-adolescent dataset to predict affective state sequences where half the mothers experience symptoms of depression. Furthermore, we evaluate NME for two model architectures, including neural conditional random fields (CRF) to predict affective state sequences, where the CRF learns nonlinear person-specific temporal transitions between affective states. Analysis of these person-specific transitions on the mother-adolescent dataset shows interpretable trends related to the mother’s depression symptoms.
Citations: 0
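A rough PyTorch sketch of the core mixed-effects idea follows: a layer whose effective weight is a shared (person-generic) matrix plus a learned per-person offset, so person-specific parameters can sit inside a nonlinear network. It is an illustration of the concept only, not the authors' NME implementation; the layer sizes, the single mixed-effects head, and the omission of any regularizer pulling person-specific offsets toward zero are all simplifying assumptions.

```python
# Conceptual sketch, not the paper's code: shared weights + per-person offsets.
import torch
import torch.nn as nn

class MixedEffectsLinear(nn.Module):
    def __init__(self, in_dim, out_dim, n_persons):
        super().__init__()
        self.shared = nn.Linear(in_dim, out_dim)                       # person-generic part
        self.person_weight = nn.Parameter(torch.zeros(n_persons, out_dim, in_dim))
        self.person_bias = nn.Parameter(torch.zeros(n_persons, out_dim))

    def forward(self, x, person_id):
        # Effective parameters = shared parameters + this person's offsets.
        w = self.shared.weight + self.person_weight[person_id]         # (B, out, in)
        b = self.shared.bias + self.person_bias[person_id]             # (B, out)
        return torch.einsum("boi,bi->bo", w, x) + b

class MoodPredictor(nn.Module):
    def __init__(self, in_dim=16, hidden=32, n_persons=50):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.head = MixedEffectsLinear(hidden, 1, n_persons)

    def forward(self, x, person_id):
        return self.head(self.encoder(x), person_id)

# Usage: a batch of 8 observations from (possibly different) people.
model = MoodPredictor()
x = torch.randn(8, 16)
person_id = torch.randint(0, 50, (8,))
pred = model(x, person_id)   # (8, 1) personalized predictions
```

In a full mixed-effects treatment the person-specific offsets would typically be shrunk toward zero (analogous to random effects); that penalty is left out here for brevity.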
Component attention network for multimodal dance improvisation recognition
Jia Fu, Jiarui Tan, Wenjie Yin, Sepideh Pashami, Mårten Björkman
DOI: 10.1145/3577190.3614114 (https://doi.org/10.1145/3577190.3614114) | Published: 2023-10-09
Abstract: Dance improvisation is an active research topic in the arts. Motion analysis of improvised dance can be challenging due to its unique dynamics. Data-driven dance motion analysis, including recognition and generation, is often limited to skeletal data. However, data of other modalities, such as audio, can be recorded and benefit downstream tasks. This paper explores the application and performance of multimodal fusion methods for human motion recognition in the context of dance improvisation. We propose an attention-based model, component attention network (CANet), for multimodal fusion on three levels: 1) feature fusion with CANet, 2) model fusion with CANet and graph convolutional network (GCN), and 3) late fusion with a voting strategy. We conduct thorough experiments to analyze the impact of each modality in different fusion methods and distinguish critical temporal or component features. We show that our proposed model outperforms the two baseline methods, demonstrating its potential for analyzing improvisation in dance.
Citations: 0
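To make the attention-based feature-fusion level concrete, here is a loose sketch of component attention over two modality embeddings (skeleton and audio): per-modality encoders produce component vectors, a learned score weights them, and the weighted sum feeds a classifier. It is not the authors' CANet architecture; the two-modality setup, dimensions, and single attention score are placeholder assumptions.

```python
# Illustrative attention-weighted feature fusion over modality components.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, skel_dim=75, audio_dim=40, embed_dim=64, n_classes=4):
        super().__init__()
        self.skel_enc = nn.Linear(skel_dim, embed_dim)    # per-modality encoders
        self.audio_enc = nn.Linear(audio_dim, embed_dim)
        self.score = nn.Linear(embed_dim, 1)              # scores each component
        self.classifier = nn.Linear(embed_dim, n_classes)

    def forward(self, skel, audio):
        # Stack the two modality components: (B, 2, embed_dim)
        comps = torch.stack([self.skel_enc(skel), self.audio_enc(audio)], dim=1)
        attn = torch.softmax(self.score(comps), dim=1)    # (B, 2, 1) component weights
        fused = (attn * comps).sum(dim=1)                 # attention-weighted fusion
        return self.classifier(fused), attn.squeeze(-1)

model = AttentionFusion()
skel = torch.randn(8, 75)    # e.g. flattened per-clip skeleton features
audio = torch.randn(8, 40)   # e.g. per-clip audio features
logits, weights = model(skel, audio)   # weights expose each modality's contribution
```

The late-fusion level mentioned in the abstract would instead train one classifier per modality and combine their predicted labels, e.g. by majority vote.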