{"title":"Expanding the Role of Affective Phenomena in Multimodal Interaction Research","authors":"Leena Mathur, Maja Mataric, Louis-Philippe Morency","doi":"10.1145/3577190.3614171","DOIUrl":"https://doi.org/10.1145/3577190.3614171","url":null,"abstract":"In recent decades, the field of affective computing has made substantial progress in advancing the ability of AI systems to recognize and express affective phenomena, such as affect and emotions, during human-human and human-machine interactions. This paper describes our examination of research at the intersection of multimodal interaction and affective computing, with the objective of observing trends and identifying understudied areas. We examined over 16,000 papers from selected conferences in multimodal interaction, affective computing, and natural language processing: ACM International Conference on Multimodal Interaction, AAAC International Conference on Affective Computing and Intelligent Interaction, Annual Meeting of the Association for Computational Linguistics, and Conference on Empirical Methods in Natural Language Processing. We identified 910 affect-related papers and present our analysis of the role of affective phenomena in these papers. We find that this body of research has primarily focused on enabling machines to recognize or express affect and emotion; there has been limited research on how affect and emotion predictions might, in turn, be used by AI systems to enhance machine understanding of human social behaviors and cognitive states. Based on our analysis, we discuss directions to expand the role of affective phenomena in multimodal interaction research.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135044920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ACE: how Artificial Character Embodiment shapes user behaviour in multi-modal interaction","authors":"Eleonora Ceccaldi, Beatrice Biancardi, Sara Falcone, Silvia Ferrando, Geoffrey Gorisse, Thomas Janssoone, Anna Martin Coesel, Pierre Raimbaud","doi":"10.1145/3577190.3617134","DOIUrl":"https://doi.org/10.1145/3577190.3617134","url":null,"abstract":"The ACE - how Artificial Character Embodiment shapes user behavior in multi-modal interactions - workshop aims to bring together researchers, practitioners and experts on the topic of embodiment, to analyze and foster discussion on its effects on user behavior in multi-modal interaction. ACE is aimed at stimulating multidisciplinary discussions on the topic, sharing recent progress, and providing participants with a forum to debate current and future challenges. The workshop includes contributions from computational, neuroscientific and psychological perspectives, as well as technical applications.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large language models in textual analysis for gesture selection","authors":"Laura Birka Hensel, Nutchanon Yongsatianchot, Parisa Torshizi, Elena Minucci, Stacy Marsella","doi":"10.1145/3577190.3614158","DOIUrl":"https://doi.org/10.1145/3577190.3614158","url":null,"abstract":"Gestures perform a variety of communicative functions that powerfully influence human face-to-face interaction. How this communicative function is achieved varies greatly between individuals and depends on the role of the speaker and the context of the interaction. Approaches to automatic gesture generation vary not only in the degree to which they rely on data-driven techniques but also the degree to which they can produce context and speaker specific gestures. However, these approaches face two major challenges: The first is obtaining sufficient training data that is appropriate for the context and the goal of the application. The second is related to designer control to realize their specific intent for the application. Here, we approach these challenges by using large language models (LLMs) to show that these powerful models of large amounts of data can be adapted for gesture analysis and generation. Specifically, we used ChatGPT as a tool for suggesting context-specific gestures that can realize designer intent based on minimal prompts. We also find that ChatGPT can suggests novel yet appropriate gestures not present in the minimal training data. The use of LLMs is a promising avenue for gesture generation that reduce the need for laborious annotations and has the potential to flexibly and quickly adapt to different designer intents.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Explainability for Bias Mitigation: A Case Study for Fair Recruitment Assessment","authors":"Gizem Sogancioglu, Heysem Kaya, Albert Ali Salah","doi":"10.1145/3577190.3614170","DOIUrl":"https://doi.org/10.1145/3577190.3614170","url":null,"abstract":"In this study, we propose a bias-mitigation algorithm, dubbed ProxyMute, that uses an explainability method to detect proxy features of a given sensitive attribute (e.g., gender) and reduces their effects on decisions by disabling them during prediction time. We evaluate our method for a job recruitment use-case, on two different multimodal datasets, namely, FairCVdb and ChaLearn LAP-FI. The exhaustive set of experiments shows that information regarding the proxy features that are provided by explainability methods is beneficial and can be successfully used for the problem of bias mitigation. Furthermore, when combined with a target label normalization method, the proposed approach shows a good performance by yielding one of the fairest results without deteriorating the performance significantly compared to previous works on both experimental datasets. The scripts to reproduce the results are available at: https://github.com/gizemsogancioglu/expl-bias-mitigation.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Assessment of Pain (AAP)","authors":"Zakia Hammal, Steffen Walter, Nadia Berthouze","doi":"10.1145/3577190.3617147","DOIUrl":"https://doi.org/10.1145/3577190.3617147","url":null,"abstract":"Pain communication varies, with some patients being highly expressive regarding their pain and others exhibiting stoic forbearance and minimal verbal account of discomfort. Considerable progress has been made in defining behavioral indices of pain [1-3]. An abundant literature shows that a limited subset of facial movements, in several non-human species, encode pain intensity across the lifespan [2]. To advance reliable pain monitoring, automated assessment of pain is emerging as a powerful mean to realize that goal. Though progress has been made, this field remains in its infancy. The workshop aims to promote current research and support growth of interdisciplinary collaborations to advance this groundbreaking research.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Breathing New Life into COPD Assessment: Multisensory Home-monitoring for Predicting Severity","authors":"Zixuan Xiao, Michal Muszynski, Ričards Marcinkevičs, Lukas Zimmerli, Adam Daniel Ivankay, Dario Kohlbrenner, Manuel Kuhn, Yves Nordmann, Ulrich Muehlner, Christian Clarenbach, Julia E. Vogt, Thomas Brunschwiler","doi":"10.1145/3577190.3614109","DOIUrl":"https://doi.org/10.1145/3577190.3614109","url":null,"abstract":"Chronic obstructive pulmonary disease (COPD) is a significant public health issue, affecting more than 100 million people worldwide. Remote patient monitoring has shown great promise in the efficient management of patients with chronic diseases. This work presents the analysis of the data from a monitoring system developed to track COPD symptoms alongside patients’ self-reports. In particular, we investigate the assessment of COPD severity using multisensory home-monitoring device data acquired from 30 patients over a period of three months. We describe a comprehensive data pre-processing and feature engineering pipeline for multimodal data from the remote home-monitoring of COPD patients. We develop and validate predictive models forecasting i) the absolute and ii) differenced COPD Assessment Test (CAT) scores based on the multisensory data. The best obtained models achieve Pearson’s correlation coefficient of 0.93 and 0.37 for absolute and differenced CAT scores. In addition, we investigate the importance of individual sensor modalities for predicting CAT scores using group sparse regularization techniques. Our results suggest that feature groups indicative of the patient’s general condition, such as static medical and physiological information, date, spirometer, and air quality, are crucial for predicting the absolute CAT score. For predicting changes in CAT scores, sleep and physical activity features are most important, alongside the previous CAT score value. Our analysis demonstrates the potential of remote patient monitoring for COPD management and investigates which sensor modalities are most indicative of COPD severity as assessed by the CAT score. Our findings contribute to the development of effective and data-driven COPD management strategies.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bridging Multimedia Modalities: Enhanced Multimodal AI Understanding and Intelligent Agents","authors":"Sushant Gautam","doi":"10.1145/3577190.3614225","DOIUrl":"https://doi.org/10.1145/3577190.3614225","url":null,"abstract":"With the increasing availability of multimodal data, especially in the sports and medical domains, there is growing interest in developing Artificial Intelligence (AI) models capable of comprehending the world in a more holistic manner. Nevertheless, various challenges exist in multimodal understanding, including the integration of multiple modalities and the resolution of semantic gaps between them. The proposed research aims to leverage multiple input modalities for the multimodal understanding of AI models, enhancing their reasoning, generation, and intelligent behavior. The research objectives focus on developing novel methods for multimodal AI, integrating them into conversational agents with optimizations for domain-specific requirements. The research methodology encompasses literature review, data curation, model development and implementation, evaluation and performance analysis, domain-specific applications, and documentation and reporting. Ethical considerations will be thoroughly addressed, and a comprehensive research plan is outlined to provide guidance. The research contributes to the field of multimodal AI understanding and the advancement of sophisticated AI systems by experimenting with multimodal data to enhance the performance of state-of-the-art neural networks.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conversational Grounding in Multimodal Dialog Systems","authors":"Biswesh Mohapatra","doi":"10.1145/3577190.3614226","DOIUrl":"https://doi.org/10.1145/3577190.3614226","url":null,"abstract":"The process of “conversational grounding” is an interactive process that has been studied extensively in cognitive science, whereby participants in a conversation check to make sure their interlocutors understand what is being referred to. This interactive process uses multiple modes of communication to establish the information between the participants. This could include information provided through eye-gaze, head movements, intonation in speech, along with the content of the speech. While the process is essential to successful communication between humans and between humans and machines, work needs to be done on testing and building the capabilities of the current dialogue system in managing conversational grounding, especially in multimodal medium of communication. Recent work such as Benotti and Blackburn [3] have shown the importance of conversational grounding in dialog systems and how current systems fail in them which is essential for the advancement of Embodied Conversational Agents and Social Robots. Thus my Ph.D. project aims to test, understand and improve the functioning of current dialog models with respect to Conversational Grounding.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural Mixed Effects for Nonlinear Personalized Predictions","authors":"Torsten Wörtwein, Nicholas B. Allen, Lisa B. Sheeber, Randy P. Auerbach, Jeffrey F. Cohn, Louis-Philippe Morency","doi":"10.1145/3577190.3614115","DOIUrl":"https://doi.org/10.1145/3577190.3614115","url":null,"abstract":"Personalized prediction is a machine learning approach that predicts a person’s future observations based on their past labeled observations and is typically used for sequential tasks, e.g., to predict daily mood ratings. When making personalized predictions, a model can combine two types of trends: (a) trends shared across people, i.e., person-generic trends, such as being happier on weekends, and (b) unique trends for each person, i.e., person-specific trends, such as a stressful weekly meeting. Mixed effect models are popular statistical models to study both trends by combining person-generic and person-specific parameters. Though linear mixed effect models are gaining popularity in machine learning by integrating them with neural networks, these integrations are currently limited to linear person-specific parameters: ruling out nonlinear person-specific trends. In this paper, we propose Neural Mixed Effect (NME) models to optimize nonlinear person-specific parameters anywhere in a neural network in a scalable manner1. NME combines the efficiency of neural network optimization with nonlinear mixed effects modeling. Empirically, we observe that NME improves performance across six unimodal and multimodal datasets, including a smartphone dataset to predict daily mood and a mother-adolescent dataset to predict affective state sequences where half the mothers experience symptoms of depression. Furthermore, we evaluate NME for two model architectures, including for neural conditional random fields (CRF) to predict affective state sequences where the CRF learns nonlinear person-specific temporal transitions between affective states. Analysis of these person-specific transitions on the mother-adolescent dataset shows interpretable trends related to the mother’s depression symptoms.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"274 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Component attention network for multimodal dance improvisation recognition","authors":"Jia Fu, Jiarui Tan, Wenjie Yin, Sepideh Pashami, Mårten Björkman","doi":"10.1145/3577190.3614114","DOIUrl":"https://doi.org/10.1145/3577190.3614114","url":null,"abstract":"Dance improvisation is an active research topic in the arts. Motion analysis of improvised dance can be challenging due to its unique dynamics. Data-driven dance motion analysis, including recognition and generation, is often limited to skeletal data. However, data of other modalities, such as audio, can be recorded and benefit downstream tasks. This paper explores the application and performance of multimodal fusion methods for human motion recognition in the context of dance improvisation. We propose an attention-based model, component attention network (CANet), for multimodal fusion on three levels: 1) feature fusion with CANet, 2) model fusion with CANet and graph convolutional network (GCN), and 3) late fusion with a voting strategy. We conduct thorough experiments to analyze the impact of each modality in different fusion methods and distinguish critical temporal or component features. We show that our proposed model outperforms the two baseline methods, demonstrating its potential for analyzing improvisation in dance.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135044383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}