Companion Publication of the 2020 International Conference on Multimodal Interaction: Latest Publications

Computational analyses of linguistic features with schizophrenic and autistic traits along with formal thought disorders
Takeshi Saga, Hiroki Tanaka, Satoshi Nakamura
{"title":"Computational analyses of linguistic features with schizophrenic and autistic traits along with formal thought disorders","authors":"Takeshi Saga, Hiroki Tanaka, Satoshi Nakamura","doi":"10.1145/3577190.3614132","DOIUrl":"https://doi.org/10.1145/3577190.3614132","url":null,"abstract":"Formal Thought Disorder (FTD), which is a group of symptoms in cognition that affects language and thought, can be observed through language. FTD is seen across such developmental or psychiatric disorders as Autism Spectrum Disorder (ASD) or Schizophrenia, and its related Schizotypal Personality Disorder (SPD). Researchers have worked on computational analyses for the early detection of such symptoms and to develop better treatments more than 40 years. This paper collected a Japanese audio-report dataset with score labels related to ASD and SPD through a crowd-sourcing service from the general population. We measured language characteristics with the 2nd edition of the Social Responsiveness Scale (SRS2) and the Schizotypal Personality Questionnaire (SPQ), including an odd speech subscale from SPQ to quantize the FTD symptoms. We investigated the following four research questions through machine-learning-based score predictions: (RQ1) How are schizotypal and autistic measures correlated? (RQ2) What is the most suitable task to elicit FTD symptoms? (RQ3) Does the length of speech affect the elicitation of FTD symptoms? (RQ4) Which features are critical for capturing FTD symptoms? We confirmed that an FTD-related subscale, odd speech, was significantly correlated with both the total SPQ and SRS scores, although they themselves were not correlated significantly. In terms of the tasks, our result identified the effectiveness of FTD elicitation by the most negative memory. Furthermore, we confirmed that longer speech elicited more FTD symptoms as the increased score prediction performance of an FTD-related subscale odd speech from SPQ. Our ablation study confirmed the importance of function words and both the abstract and temporal features for FTD-related odd speech estimation. In contrast, embedding-based features were effective only in the SRS predictions, and content words were effective only in the SPQ predictions, a result that implies the differences of SPD-like and ASD-like symptoms. Data and programs used in this paper can be found here: https://sites.google.com/view/sagatake/resource.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
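The abstract describes correlating an FTD-related odd-speech subscale with SPQ and SRS totals and predicting scores from linguistic features with machine learning. The sketch below illustrates that general workflow on synthetic data; the feature names (function-word rate, content-word rate, abstractness, temporal-expression rate), the ridge regressor, and the evaluation are assumptions for illustration only, not the authors' pipeline or results.

```python
# Illustrative sketch on synthetic data; features, model, and metrics are assumptions.
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200  # synthetic participants

# Synthetic questionnaire scores (stand-ins for the odd-speech subscale, SPQ total, SRS total).
odd_speech = rng.normal(size=n)
spq_total = 0.6 * odd_speech + rng.normal(scale=0.8, size=n)
srs_total = 0.5 * odd_speech + rng.normal(scale=0.9, size=n)

# (1) Correlation structure among the scales.
for name, total in [("SPQ total", spq_total), ("SRS total", srs_total)]:
    rho, p = spearmanr(odd_speech, total)
    print(f"odd speech vs {name}: rho={rho:.2f}, p={p:.3g}")

# (2) Score prediction from hypothetical linguistic features.
features = np.column_stack([
    0.4 * odd_speech + rng.normal(scale=1.0, size=n),  # function-word rate
    rng.normal(size=n),                                # content-word rate
    0.3 * odd_speech + rng.normal(scale=1.0, size=n),  # abstractness score
    0.2 * odd_speech + rng.normal(scale=1.0, size=n),  # temporal-expression rate
])
r2 = cross_val_score(Ridge(alpha=1.0), features, odd_speech, cv=5, scoring="r2")
print(f"cross-validated R^2 for odd-speech prediction: {r2.mean():.2f}")
```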
Paying Attention to Wildfire: Using U-Net with Attention Blocks on Multimodal Data for Next Day Prediction
Jack Fitzgerald, Ethan Seefried, James E Yost, Sangmi Pallickara, Nathaniel Blanchard
{"title":"Paying Attention to Wildfire: Using U-Net with Attention Blocks on Multimodal Data for Next Day Prediction","authors":"Jack Fitzgerald, Ethan Seefried, James E Yost, Sangmi Pallickara, Nathaniel Blanchard","doi":"10.1145/3577190.3614116","DOIUrl":"https://doi.org/10.1145/3577190.3614116","url":null,"abstract":"Predicting where wildfires will spread provides invaluable information to firefighters and scientists, which can save lives and homes. However, doing so requires a large amount of multimodal data e.g., accurate weather predictions, real-time satellite data, and environmental descriptors. In this work, we utilize 12 distinct features from multiple modalities in order to predict where wildfires will spread over the next 24 hours. We created a custom U-Net architecture designed to train as efficiently as possible, while still maximizing accuracy, to facilitate quickly deploying the model when a wildfire is detected. Our custom architecture demonstrates state-of-the-art performance and trains an order of magnitude more quickly than prior work, while using fewer computational resources. We further evaluated our architecture with an ablation study to identify which features were key for prediction and which provided negligible impact on performance. All of our source code is available on GitHub1.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"2015 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
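As a rough illustration of the architectural idea named in the abstract, here is a tiny attention-gated U-Net in PyTorch: 12 multimodal input channels, an additive attention gate on the skip connection (in the style of Attention U-Net), and a per-pixel next-day fire logit. Depth, channel widths, and the gate design are assumptions, not the authors' architecture.

```python
# Minimal attention-gated U-Net sketch; sizes and gate design are illustrative assumptions.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )

class AttentionGate(nn.Module):
    """Additive attention gate: re-weights skip features using the decoder signal."""
    def __init__(self, skip_ch, gate_ch, inter_ch):
        super().__init__()
        self.w_skip = nn.Conv2d(skip_ch, inter_ch, 1)
        self.w_gate = nn.Conv2d(gate_ch, inter_ch, 1)
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, 1), nn.Sigmoid())

    def forward(self, skip, gate):
        attn = self.psi(torch.relu(self.w_skip(skip) + self.w_gate(gate)))
        return skip * attn  # suppress irrelevant spatial regions in the skip path

class TinyAttentionUNet(nn.Module):
    def __init__(self, in_ch=12):
        super().__init__()
        self.enc1 = conv_block(in_ch, 32)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(32, 64)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.gate = AttentionGate(skip_ch=32, gate_ch=32, inter_ch=16)
        self.dec1 = conv_block(64, 32)
        self.head = nn.Conv2d(32, 1, 1)  # next-day fire logit per pixel

    def forward(self, x):
        e1 = self.enc1(x)
        b = self.bottleneck(self.pool(e1))
        g = self.up(b)
        skip = self.gate(e1, g)
        d1 = self.dec1(torch.cat([skip, g], dim=1))
        return self.head(d1)

if __name__ == "__main__":
    model = TinyAttentionUNet(in_ch=12)
    x = torch.randn(2, 12, 64, 64)   # batch of 12-channel input tiles
    print(model(x).shape)            # -> torch.Size([2, 1, 64, 64])
```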
The 5th Workshop on Modeling Socio-Emotional and Cognitive Processes from Multimodal Data in the Wild (MSECP-Wild)
Bernd Dudzik, Tiffany Matej Hrkalovic, Dennis Küster, David St-Onge, Felix Putze, Laurence Devillers
{"title":"The 5th Workshop on Modeling Socio-Emotional and Cognitive Processes from Multimodal Data in the Wild (MSECP-Wild)","authors":"Bernd Dudzik, Tiffany Matej Hrkalovic, Dennis Küster, David St-Onge, Felix Putze, Laurence Devillers","doi":"10.1145/3577190.3616883","DOIUrl":"https://doi.org/10.1145/3577190.3616883","url":null,"abstract":"The ability to automatically infer relevant aspects of human users’ thoughts and feelings is crucial for technologies to intelligently adapt their behaviors in complex interactions. Research on multimodal analysis has demonstrated the potential of technology to provide such estimates for a broad range of internal states and processes. However, constructing robust approaches for deployment in real-world applications remains an open problem. The MSECP-Wild workshop series is a multidisciplinary forum to present and discuss research addressing this challenge. Submissions to this 5th iteration span efforts relevant to multimodal data collection, modeling, and applications. In addition, our workshop program builds on discussions emerging in previous iterations, highlighting ethical considerations when building and deploying technology modeling internal states in the wild. For this purpose, we host a range of relevant keynote speakers and interactive activities.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"273 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Robot Just for You: Multimodal Personalized Human-Robot Interaction and the Future of Work and Care
Maja Mataric
{"title":"A Robot Just for You: Multimodal Personalized Human-Robot Interaction and the Future of Work and Care","authors":"Maja Mataric","doi":"10.1145/3577190.3616524","DOIUrl":"https://doi.org/10.1145/3577190.3616524","url":null,"abstract":"As AI becomes ubiquitous, its physical embodiment—robots–will also gradually enter our lives. As they do, we will demand that they understand us, predict our needs and wants, and adapt to us as we change our moods and minds, learn, grow, and age. The nexus created by recent major advances in machine learning for machine perception, navigation, and natural language processing has enabled human-robot interaction in real-world contexts, just as the need for human services continues to grow, from elder care to nursing to education and training. This talk will discuss our research in socially assistive robotics (SAR), which uses embodied social interaction to support user goals in health, wellness, training, and education. SAR brings together machine learning for user modeling, multimodal behavioral signal processing, and affective computing to enable robots to understand, interact, and adapt to users’ specific and ever-changing needs. The talk will cover methods and challenges of using multi-modal interaction data and expressive robot behavior to monitor, coach, motivate, and support a wide variety of user populations and use cases. We will cover insights from work with users across the age span (infants, children, adults, elderly), ability span (typically developing, autism, stroke, Alzheimer’s), contexts (schools, therapy centers, homes), and deployment durations (up to 6 months), as well as commercial implications.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The UEA Digital Humans entry to the GENEA Challenge 2023
Jonathan Windle, Iain Matthews, Ben Milner, Sarah Taylor
{"title":"The UEA Digital Humans entry to the GENEA Challenge 2023","authors":"Jonathan Windle, Iain Matthews, Ben Milner, Sarah Taylor","doi":"10.1145/3577190.3616116","DOIUrl":"https://doi.org/10.1145/3577190.3616116","url":null,"abstract":"This paper describes our entry to the GENEA (Generation and Evaluation of Non-verbal Behaviour for Embodied Agents) Challenge 2023. This year’s challenge focuses on generating gestures in a dyadic setting – predicting a main-agent’s motion from the speech of both the main-agent and an interlocutor. We adapt a Transformer-XL architecture for this task by adding a cross-attention module that integrates the interlocutor’s speech with that of the main-agent. Our model is conditioned on speech audio (encoded using PASE+), text (encoded using FastText) and a speaker identity label, and is able to generate smooth and speech appropriate gestures for a given identity. We consider the GENEA Challenge user study results and present a discussion of our model strengths and where improvements can be made.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135043298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
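The core idea named in the abstract is a cross-attention module through which main-agent features attend to the interlocutor's speech features. The PyTorch sketch below shows one plausible form of such a module; the feature dimensions, the use of nn.MultiheadAttention, the residual layout, and the pose dimension are assumptions, whereas the actual entry embeds this inside an adapted Transformer-XL conditioned on PASE+ audio, FastText text, and a speaker identity label.

```python
# Cross-attention sketch only; dimensions and surrounding layers are illustrative assumptions.
import torch
import torch.nn as nn

class InterlocutorCrossAttention(nn.Module):
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, main_feats, interlocutor_feats):
        # Queries come from the main agent; keys/values come from the interlocutor.
        attended, _ = self.cross_attn(main_feats, interlocutor_feats, interlocutor_feats)
        return self.norm(main_feats + attended)  # residual connection

if __name__ == "__main__":
    B, T, D = 2, 100, 256                  # batch, frames, feature dim
    main = torch.randn(B, T, D)            # e.g., fused audio/text/speaker features
    other = torch.randn(B, T, D)           # interlocutor speech features
    fused = InterlocutorCrossAttention()(main, other)
    motion = nn.Linear(D, 57)(fused)       # hypothetical pose dimension per frame
    print(motion.shape)                    # -> torch.Size([2, 100, 57])
```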
Classification of Alzheimer's Disease with Deep Learning on Eye-tracking Data
Harshinee Sriram, Cristina Conati, Thalia Field
{"title":"Classification of Alzheimer's Disease with Deep Learning on Eye-tracking Data","authors":"Sriram, Harshinee, Conati, Cristina, Field, Thalia","doi":"10.1145/3577190.3614149","DOIUrl":"https://doi.org/10.1145/3577190.3614149","url":null,"abstract":"Existing research has shown the potential of classifying Alzheimer's Disease (AD) from eye-tracking (ET) data with classifiers that rely on task-specific engineered features. In this paper, we investigate whether we can improve on existing results by using a Deep Learning classifier trained end-to-end on raw ET data. This classifier (VTNet) uses a GRU and a CNN in parallel to leverage both visual (V) and temporal (T) representations of ET data and was previously used to detect user confusion while processing visual displays. A main challenge in applying VTNet to our target AD classification task is that the available ET data sequences are much longer than those used in the previous confusion detection task, pushing the limits of what is manageable by LSTM-based models. We discuss how we address this challenge and show that VTNet outperforms the state-of-the-art approaches in AD classification, providing encouraging evidence on the generality of this model to make predictions from ET data.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135044316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
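To make the parallel visual/temporal design concrete, here is a minimal VTNet-style sketch in PyTorch: a GRU over the raw eye-tracking time series alongside a CNN over a visual rendering of the same data (e.g., a scanpath image), with the two representations concatenated for binary classification. Input sizes, channel counts, and the fusion head are assumptions, not the published model.

```python
# VTNet-style sketch; input shapes, branches, and fusion are illustrative assumptions.
import torch
import torch.nn as nn

class VTNetSketch(nn.Module):
    def __init__(self, seq_features=6, hidden=64):
        super().__init__()
        self.gru = nn.GRU(seq_features, hidden, batch_first=True)   # temporal branch
        self.cnn = nn.Sequential(                                   # visual branch
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(),
        )
        self.classifier = nn.Linear(hidden + 16 * 4 * 4, 2)         # AD vs. control

    def forward(self, sequence, image):
        _, h = self.gru(sequence)          # h: (1, B, hidden) final hidden state
        temporal = h[-1]
        visual = self.cnn(image)
        return self.classifier(torch.cat([temporal, visual], dim=1))

if __name__ == "__main__":
    seq = torch.randn(4, 3000, 6)          # long raw ET sequences (x, y, pupil, ...)
    img = torch.randn(4, 1, 64, 64)        # scanpath image rendering
    print(VTNetSketch()(seq, img).shape)   # -> torch.Size([4, 2])
```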
Enhancing Surgical Team Collaboration and Situation Awareness through Multimodal Sensing
Arnaud Allemang--Trivalle
{"title":"Enhancing Surgical Team Collaboration and Situation Awareness through Multimodal Sensing","authors":"Arnaud Allemang--Trivalle","doi":"10.1145/3577190.3614233","DOIUrl":"https://doi.org/10.1145/3577190.3614233","url":null,"abstract":"Surgery, typically seen as the surgeon’s sole responsibility, requires a broader perspective acknowledging the vital roles of other operating room (OR) personnel. The interactions among team members are crucial for delivering quality care and depend on shared situation awareness. I propose a two-phase approach to design and evaluate a multimodal platform that monitors OR members, offering insights into surgical procedures. The first phase focuses on designing a data-collection platform, tailored to surgical constraints, to generate novel collaboration and situation-awareness metrics using synchronous recordings of the participants’ voices, positions, orientations, electrocardiograms, and respiration signals. The second phase concerns the creation of intuitive dashboards and visualizations, aiding surgeons in reviewing recorded surgery, identifying adverse events and contributing to proactive measures. This work aims to demonstrate an innovative approach to data collection and analysis, augmenting the surgical team’s capabilities. The multimodal platform has the potential to enhance collaboration, foster situation awareness, and ultimately mitigate surgical adverse events. This research sets the stage for a transformative shift in the OR, enabling a more holistic and inclusive perspective that recognizes that surgery is a team effort.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135044382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
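One concrete building block such a platform needs is temporal alignment of heterogeneous sensor streams onto a common clock before team-level metrics can be computed. The NumPy sketch below shows simple linear-interpolation resampling of synthetic heart-rate and respiration streams; the stream choices, rates, and method are assumptions, not the platform's actual synchronization design.

```python
# Stream-alignment sketch on synthetic data; rates and method are illustrative assumptions.
import numpy as np

def resample_to_common_clock(timestamps, values, common_clock):
    """Linearly interpolate one sensor stream onto the shared timeline (seconds)."""
    return np.interp(common_clock, timestamps, values)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    duration = 60.0                                   # one minute of recording
    common_clock = np.arange(0.0, duration, 0.1)      # shared 10 Hz timeline

    # Synthetic streams at their native rates.
    hr_t = np.arange(0.0, duration, 1.0)              # heart rate at 1 Hz
    hr = 70 + 5 * rng.standard_normal(hr_t.size)
    resp_t = np.arange(0.0, duration, 0.04)           # respiration at 25 Hz
    resp = np.sin(2 * np.pi * 0.25 * resp_t)          # ~15 breaths per minute

    aligned = np.column_stack([
        resample_to_common_clock(hr_t, hr, common_clock),
        resample_to_common_clock(resp_t, resp, common_clock),
    ])
    print(aligned.shape)                              # -> (600, 2): frames x streams
```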
Analyzing and Recognizing Interlocutors' Gaze Functions from Multimodal Nonverbal Cues
Ayane Tashiro, Mai Imamura, Shiro Kumano, Kazuhiro Otsuka
{"title":"Analyzing and Recognizing Interlocutors' Gaze Functions from Multimodal Nonverbal Cues","authors":"Ayane Tashiro, Mai Imamura, Shiro Kumano, Kazuhiro Otsuka","doi":"10.1145/3577190.3614152","DOIUrl":"https://doi.org/10.1145/3577190.3614152","url":null,"abstract":"A novel framework is presented for analyzing and recognizing the functions of gaze in group conversations. Considering the multiplicity and ambiguity of the gaze functions, we first define 43 nonexclusive gaze functions that play essential roles in conversations, such as monitoring, regulation, and expressiveness. Based on the defined functions, in this study, a functional gaze corpus is created, and a corpus analysis reveals several frequent functions, such as addressing and thinking while speaking and attending by listeners. Next, targeting the ten most frequent functions, we build convolutional neural networks (CNNs) to recognize the frame-based presence/absence of each gaze function from multimodal inputs, including head pose, utterance status, gaze/avert status, eyeball direction, and facial expression. Comparing different input sets, our experiments confirm that the proposed CNN using all modality inputs achieves the best performance and an F value of 0.839 for listening while looking.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135044537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
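As an illustration of frame-based, multi-label recognition of non-exclusive gaze functions, the PyTorch sketch below runs a small 1-D CNN over a short window of concatenated multimodal features and outputs one logit per function, trained with a binary cross-entropy objective. The feature dimensionality, window length, and architecture are assumptions, not the authors' network.

```python
# Multi-label gaze-function CNN sketch; sizes and layers are illustrative assumptions.
import torch
import torch.nn as nn

class GazeFunctionCNN(nn.Module):
    def __init__(self, in_features=32, n_functions=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_features, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(64, n_functions),   # one logit per (non-exclusive) function
        )

    def forward(self, x):                 # x: (batch, features, frames in window)
        return self.net(x)

if __name__ == "__main__":
    model = GazeFunctionCNN()
    window = torch.randn(8, 32, 30)       # 8 windows of 30 frames, 32 features each
    logits = model(window)
    labels = torch.randint(0, 2, (8, 10)).float()
    loss = nn.BCEWithLogitsLoss()(logits, labels)   # multi-label objective
    print(logits.shape, loss.item())
```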
Deep Breathing Phase Classification with a Social Robot for Mental Health
Kayla Matheus, Ellie Mamantov, Marynel Vázquez, Brian Scassellati
{"title":"Deep Breathing Phase Classification with a Social Robot for Mental Health","authors":"Kayla Matheus, Ellie Mamantov, Marynel Vázquez, Brian Scassellati","doi":"10.1145/3577190.3614173","DOIUrl":"https://doi.org/10.1145/3577190.3614173","url":null,"abstract":"Social robots are in a unique position to aid mental health by supporting engagement with behavioral interventions. One such behavioral intervention is the practice of deep breathing, which has been shown to physiologically reduce symptoms of anxiety. Multiple robots have been recently developed that support deep breathing, but none yet implement a method to detect how accurately an individual is performing the practice. Detecting breathing phases (i.e., inhaling, breath holding, or exhaling) is a challenge with these robots since often the robot is being manipulated or moved by the user, or the robot itself is moving to generate haptic feedback. Accordingly, we first present OMMDB: a novel, multimodal, public dataset made up of individuals performing deep breathing with an Ommie robot in multiple conditions of robot ego-motion. The dataset includes RGB video, inertial sensor data, and motor encoder data, as well as ground truth breathing data from a respiration belt. Our second contribution features experimental results with a convolutional long-short term memory neural network trained using OMMDB. These results show the system’s ability to be applied to the domain of deep breathing and generalize between individual users. We additionally show that our model is able to generalize across multiple types of robot ego-motion, reducing the need to train individual models for varying human-robot interaction conditions.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135044670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
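Here is a minimal sketch of a convolutional LSTM pipeline of the general kind described: per-frame CNN features from RGB video are fused with inertial features and passed to an LSTM that labels each frame as inhale, hold, or exhale. Input shapes, feature sizes, and the fusion scheme are assumptions, and the tensors here are random placeholders rather than OMMDB data.

```python
# Convolutional-LSTM breathing-phase sketch; shapes and fusion are illustrative assumptions.
import torch
import torch.nn as nn

class BreathingPhaseNet(nn.Module):
    def __init__(self, imu_dim=9, hidden=64, n_phases=3):
        super().__init__()
        self.frame_cnn = nn.Sequential(        # per-frame visual encoder
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.lstm = nn.LSTM(16 + imu_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_phases)  # per-frame phase logits

    def forward(self, video, imu):
        B, T = video.shape[:2]
        frames = video.reshape(B * T, *video.shape[2:])
        visual = self.frame_cnn(frames).reshape(B, T, -1)
        fused = torch.cat([visual, imu], dim=-1)
        out, _ = self.lstm(fused)
        return self.head(out)                   # (B, T, n_phases)

if __name__ == "__main__":
    video = torch.randn(2, 50, 3, 64, 64)       # 50 RGB frames per clip
    imu = torch.randn(2, 50, 9)                 # synchronized inertial features
    print(BreathingPhaseNet()(video, imu).shape)  # -> torch.Size([2, 50, 3])
```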
Synerg-eye-zing: Decoding Nonlinear Gaze Dynamics Underlying Successful Collaborations in Co-located Teams
G. S. Rajshekar Reddy, Lucca Eloy, Rachel Dickler, Jason G. Reitman, Samuel L. Pugh, Peter W. Foltz, Jamie C. Gorman, Julie L. Harrison, Leanne Hirshfield
{"title":"Synerg-eye-zing: Decoding Nonlinear Gaze Dynamics Underlying Successful Collaborations in Co-located Teams","authors":"G. S. Rajshekar Reddy, Lucca Eloy, Rachel Dickler, Jason G. Reitman, Samuel L. Pugh, Peter W. Foltz, Jamie C. Gorman, Julie L. Harrison, Leanne Hirshfield","doi":"10.1145/3577190.3614104","DOIUrl":"https://doi.org/10.1145/3577190.3614104","url":null,"abstract":"Joint Visual Attention (JVA) has long been considered a critical component of successful collaborations, enabling coordination and construction of a shared knowledge space. However, recent studies challenge the notion that JVA alone ensures effective collaboration. To gain deeper insights into JVA’s influence, we examine nonlinear gaze coupling and gaze regularity in the collaborators’ visual attention. Specifically, we analyze gaze data from 19 dyadic and triadic teams engaged in a co-located programming task using Recurrence Quantification Analysis (RQA). Our results emphasize the significance of team-level gaze regularity for improving task performance - highlighting the importance of maintaining stable or sustained episodes of joint or individual attention, than disjointed patterns. Additionally, through regression analyses, we examine the predictive capacity of recurrence metrics for subjective traits such as social cohesion and social loafing, revealing unique interpersonal and team dynamics behind productive collaborations. We elaborate on our findings via qualitative anecdotes and discuss their implications in shaping real-time interventions for optimizing collaborative success.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"265 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135045700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
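For readers unfamiliar with RQA, the NumPy sketch below builds a cross-recurrence matrix over two categorical area-of-interest (AOI) gaze sequences and computes two standard metrics, recurrence rate and determinism. The AOI coding, the categorical (threshold-free) recurrence definition, and the minimum diagonal line length of 2 are assumptions, not the authors' exact RQA configuration.

```python
# Cross-recurrence sketch on synthetic AOI sequences; settings are illustrative assumptions.
import numpy as np

def cross_recurrence(a, b):
    """Binary matrix R[i, j] = 1 where person A's AOI at time i equals B's at time j."""
    return (a[:, None] == b[None, :]).astype(int)

def recurrence_rate(r):
    return r.mean()

def determinism(r, min_len=2):
    """Fraction of recurrent points lying on diagonal lines of length >= min_len."""
    n = r.shape[0]
    on_lines = 0
    for k in range(-n + 1, n):
        diag = np.diagonal(r, offset=k)
        run = 0
        for v in list(diag) + [0]:          # trailing 0 flushes the final run
            if v:
                run += 1
            else:
                if run >= min_len:
                    on_lines += run
                run = 0
    total = r.sum()
    return on_lines / total if total else 0.0

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    aois = np.array(["screen", "partner", "keyboard", "elsewhere"])
    gaze_a = rng.choice(aois, size=200)      # synthetic AOI sequence, person A
    gaze_b = rng.choice(aois, size=200)      # synthetic AOI sequence, person B
    r = cross_recurrence(gaze_a, gaze_b)
    print(f"recurrence rate: {recurrence_rate(r):.3f}")
    print(f"determinism:     {determinism(r):.3f}")
```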