Companion Publication of the 2022 International Conference on Multimodal Interaction: Latest Articles

Mattpod: A Design Proposal for a Multi-Sensory Solo Dining Experience
Mimi Bocanegra, Mailin Lemke, R. D. de Vries, Geke D. S. Ludden
DOI: 10.1145/3536220.3563688 (published 2022-11-07)
Abstract: The consumption of a meal is not just a bodily requirement but can also carry significant symbolic meaning. Solo dining is often contrasted with a shared eating experience and portrayed as an inferior way of eating a meal because it lacks essential social and normative qualities. Human-computer interaction research increasingly explores different ways of enhancing the solo dining experience. However, the focus tends to be on recreating aspects essential to the shared eating experience, such as the presence of a dining companion, rather than on enhancing the aspects that solo diners enjoy and that therefore contribute to a reverie in eating. Based on earlier research findings, we developed a design concept that includes sound and visual elements supporting the multi-sensory eating experience and encouraging the user to concentrate on the food rather than seek distraction. The formative usability evaluation results indicate that the proposed design needs further refinement to evoke the anticipated effect.
Citations: 0
Predicting Backchannel Signaling in Child-Caregiver Multimodal Conversations
J. Liu, Mitja Nikolaus, Kubra Bodur, Abdellah Fourtassi
DOI: 10.1145/3536220.3563372 (published 2022-11-07)
Abstract: Conversation requires cooperative social interaction between interlocutors. In particular, active listening through backchannel signaling (hereafter BC), i.e., showing attention through verbal behaviors (short responses like "Yeah") and non-verbal behaviors (e.g., smiling or nodding), is crucial to managing the flow of a conversation and requires sophisticated coordination skills. How does BC develop in childhood? Previous studies were either conducted in highly controlled experimental settings or relied on qualitative corpus analysis, which does not allow for a proper understanding of children's BC development, especially in terms of its collaborative/coordinated use. This paper aims to fill this gap using a machine learning model that learns to predict children's BC production based on the interlocutor's inviting cues in naturalistic child-caregiver conversations. By comparing BC predictability across children and adults, we found that, contrary to what has been suggested in previous in-lab studies, children between the ages of 6 and 12 can produce and respond to backchannel inviting cues as consistently as adults do, suggesting an adult-like form of coordination.
Citations: 3
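The abstract does not specify the model family or the exact cue features, so the following is only a minimal sketch of the general setup it describes: a scikit-learn classifier predicting whether a backchannel occurs in a conversational window from hypothetical interlocutor "inviting cue" features, with predictability compared via cross-validated AUC. All feature names and data here are invented for illustration.

```python
# Minimal sketch (not the authors' code): predict listener backchannels from
# hypothetical interlocutor "inviting cue" features with a scikit-learn classifier.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical per-window features: pause length (s), pitch slope, gaze-at-listener flag.
X = rng.normal(size=(500, 3))
y = rng.integers(0, 2, size=500)           # 1 = backchannel produced in the window

clf = GradientBoostingClassifier()
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print("mean AUC across folds:", scores.mean())   # compare child vs. adult predictability this way
```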
Towards Integration of Embodiment Features for Prosodic Prominence Prediction from Text
P. Madhyastha
DOI: 10.1145/3536220.3558540 (published 2022-11-07)
Abstract: Prosodic prominence prediction is an important task in speech processing and forms an essential part of modern text-to-speech systems. Previous work has broadly focused on acoustic and linguistic features (such as syntactic and semantic features) for predicting prosodic prominence. However, human models of prosody are known to be highly multimodal and grounded in denotations of physical entities and embodied experience. In this paper we present a first study that integrates multimodal sensorimotor associations by exploiting the Lancaster Sensorimotor Norms for prosodic prominence prediction. Our results highlight the importance of sensorimotor knowledge, especially for models in low-data regimes, where we show that it improves performance by a significant margin.
Citations: 0
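As a rough illustration of the feature-integration idea (not the paper's implementation), the sketch below appends per-word sensorimotor ratings to ordinary text features before a prominence classifier. The three-dimensional lookup, the rating values, and the labels are invented for the example; the Lancaster norms themselves provide several perceptual and action dimensions per word.

```python
# Minimal sketch, assuming a simple "concatenate norms with text features" integration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical lookup: word -> sensorimotor ratings (toy values, toy dimensionality).
norms = {"thunder": np.array([4.8, 0.9, 2.1]), "idea": np.array([0.7, 0.2, 0.4])}

def word_features(word, text_vec):
    sm = norms.get(word, np.zeros(3))        # back off to zeros for unseen words
    return np.concatenate([text_vec, sm])    # textual features + sensorimotor features

rng = np.random.default_rng(0)
words = ["thunder", "idea", "thunder", "idea"]
X = np.stack([word_features(w, rng.normal(size=5)) for w in words])
y = np.array([1, 0, 1, 0])                   # 1 = prosodically prominent (toy labels)
clf = LogisticRegression().fit(X, y)
print(clf.predict(X))
```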
Real-time Public Speaking Anxiety Prediction Model for Oral Presentations
Everlyne Kimani, T. Bickmore, Rosalind W. Picard, Matthew Goodwin, H. Jimison
DOI: 10.1145/3536220.3563686 (published 2022-11-07)
Abstract: Oral presentation skills are essential for most people's academic and career development. However, due to public speaking anxiety, many people find oral presentations challenging and often avoid them to the detriment of their careers. Interventions that help presenters manage their anxiety as it occurs during a presentation could benefit many presenters. In this paper, we present a model for assessing public speaking anxiety during a presentation, a first step towards developing real-time anxiety interventions. We describe our method for ground-truth data collection and report the results of neural network models for real-time anxiety detection from audio data. Our results show that an LSTM model can predict moments of speaking anxiety during a presentation.
Citations: 0
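The abstract names an LSTM over audio data but not its exact inputs or architecture, so the following is a minimal sketch under assumptions: per-frame audio features (e.g., MFCC-like vectors) fed to a single-layer LSTM whose last hidden state is mapped to a per-window anxiety probability.

```python
# Minimal sketch (assumed architecture, not the authors' exact model).
import torch
import torch.nn as nn

class AnxietyLSTM(nn.Module):
    def __init__(self, n_feats=40, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_feats, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (batch, frames, n_feats)
        out, _ = self.lstm(x)
        return torch.sigmoid(self.head(out[:, -1]))   # anxiety score per window

model = AnxietyLSTM()
windows = torch.randn(8, 100, 40)              # 8 windows of 100 audio frames each
print(model(windows).shape)                    # torch.Size([8, 1])
```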
Endowing Spiking Neural Networks with Homeostatic Adaptivity for APS-DVS Bimodal Scenarios
M. Xu, Faqiang Liu, Jing Pei
DOI: 10.1145/3536220.3563690 (published 2022-11-07)
Abstract: Plastic changes with intrinsic dynamics in synaptic efficacy underlie the cellular-level expression of brain functions involved in multimodal information processing. Among diverse plasticity mechanisms, synaptic scaling exerts indispensable effects on homeostatic state maintenance and synaptic strength regulation in biological neural networks. Despite recent progress in developing spiking neural networks (SNNs) for complex scenarios, most work remains within a purely backpropagation-based framework in which the synaptic scaling mechanism is rarely incorporated effectively. In this work, we present a biologically inspired neuronal model with an activity-dependent adaptive synaptic scaling mechanism that endows each synapse with both short-term enhancement and depression properties. The learning process is completed in two phases: first, in the forward conduction circuits, an adaptive short-term enhancement or depression response is triggered according to afferent stimulus intensity; then, long-term consolidation is executed by back-propagated error signals. These processes shape the pattern selectivity of synapses and the diverse information transfer they mediate. Experiments reveal advantages in three bimodal learning tasks. Specifically, on the continual learning and perturbation-resistance tasks for Dynamic Vision Sensor (DVS) modal information, our method improves the mean accuracy on the N-MNIST benchmark over the baseline. On the sequence learning task for Active Pixel Sensor (APS) modal information, our method improves generalization capability and training stability by a large margin. These results demonstrate the effectiveness of such a non-parametric adaptive strategy for bimodal inference over APS and DVS data, facilitating the understanding of intelligence and bio-inspired modelling.
Citations: 1
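For intuition only, the sketch below shows a drastically simplified, rate-based toy of homeostatic synaptic scaling: weights are multiplicatively nudged toward a target activity level, enhancing synapses of under-active units and depressing over-active ones. This is not the paper's spiking model, which combines the scaling mechanism with backpropagation-trained SNNs; all constants here are arbitrary.

```python
# Toy rate-based homeostatic synaptic scaling (illustrative, not the paper's method).
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(10, 20))    # 20 inputs -> 10 neurons
target_rate, eta = 0.2, 0.05

for step in range(100):
    x = rng.random(20)                       # afferent stimulus intensity
    rate = 1.0 / (1.0 + np.exp(-(W @ x)))    # surrogate firing rate per neuron
    # Multiplicative scaling: enhance under-active neurons, depress over-active ones.
    W *= (1.0 + eta * (target_rate - rate))[:, None]

print("mean surrogate rate after scaling:", rate.mean())
```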
Towards Automatic Prediction of Non-Expert Perceived Speech Fluency Ratings
S. P. Dubagunta, Edoardo Moneta, Eleni Theocharopoulos, Mathew Magimai Doss
DOI: 10.1145/3536220.3563689 (published 2022-11-07)
Abstract: Automatic speech fluency prediction has mainly been approached from the perspective of computer-aided language learning, where the system aims to predict ratings similar to those of human experts. Speech fluency can, however, also be judged in more relaxed social settings, where the ratings usually come from non-experts; indeed, everyday assessments of fluency are made by our social environment and encounters, which, due to globalisation, are increasingly international in nature, so being a non-expert rater has become the norm. This paper explores the latter direction, i.e., prediction of non-expert perceived speech fluency ratings, which, to the best of our knowledge, has not been studied in the speech technology literature. Toward that goal, we investigate several approaches, namely (a) low-level descriptor feature functionals, (b) a bag-of-audio-words approach, and (c) a neural network based end-to-end acoustic modelling approach. Our investigations on speech data collected from 54 speakers and rated by seven non-experts demonstrate that non-expert speech fluency ratings can be systematically predicted, with the best performing system yielding a Pearson's correlation coefficient of 0.66 and a Spearman's correlation coefficient of 0.67 with the median human scores.
Citations: 0
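The evaluation described here, correlating system scores with the median of the seven non-expert ratings, can be reproduced with standard SciPy functions; the rating and prediction values below are made up purely to show the computation.

```python
# Minimal sketch: Pearson/Spearman correlation of predicted fluency scores
# against median non-expert ratings (toy numbers, same metrics as the abstract).
import numpy as np
from scipy.stats import pearsonr, spearmanr

human = np.array([[3, 4, 4, 2, 5, 3, 4],     # 7 non-expert ratings per speaker
                  [2, 2, 3, 1, 2, 2, 3],
                  [5, 4, 5, 5, 4, 5, 5]])
median_scores = np.median(human, axis=1)      # reference score per speaker
predicted = np.array([3.6, 2.1, 4.8])         # hypothetical system outputs

print("Pearson: ", pearsonr(predicted, median_scores)[0])
print("Spearman:", spearmanr(predicted, median_scores).correlation)
```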
Approbation of the Child's Emotional Development Method (CEDM)
E. Lyakso, O. Frolova, E. Kleshnev, N. Ruban, A. Mekala, K. Arulalan
DOI: 10.1145/3536220.3563371 (published 2022-11-07)
Abstract: The paper describes a methodological approach for assessing the formation of the emotional sphere of children aged 5-16 years with typical and atypical development. The purpose of the Child's Emotional Development Method (CEDM) is to assess children's emotional development by determining their ability to express their own emotions, the adequacy of those emotions, and their recognition of the emotional states of others. The approach is based on adapted methods and scales, assessment criteria, and methodological approaches tested in a number of studies, depending on the children's age, developmental characteristics, language, and the cultural characteristics of their country of residence. The approach includes two blocks: information about the child's development obtained from parents or legal representatives, and methods for testing children, including interviews, psychological tests, and play situations. The assessment uses Likert-scale scoring. A pilot approbation of the approach was carried out with 30 children aged 8-16 years: typically developing (TD) children and children with autism spectrum disorders (ASD) or Down syndrome (DS). The results of the study formed the basis for the Scale of Emotional Sphere Development of Children and Adolescents.
Citations: 4
A Wavelet-based Approach for Multimodal Prediction of Alexithymia from Physiological Signals
Valeria Filippou, Nikolas Theodosiou, M. Nicolaou, E. Constantinou, G. Panayiotou, Marios Theodorou
DOI: 10.1145/3536220.3558076 (published 2022-11-07)
Abstract: Alexithymia is a trait reflecting a person's difficulty in identifying and expressing their emotions and has been linked to various forms of psychopathology. Identifying alexithymia may have therapeutic, preventive, and diagnostic benefits. However, little research has proposed predictive models for alexithymia, and the literature on multimodal approaches is virtually non-existent. In this light, we present, to the best of our knowledge, the first predictive framework that leverages multimodal physiological signals (heart rate, skin conductance level, facial electromyograms) to detect alexithymia. In particular, we develop a set of features that primarily capture spectral information, localized in the time domain via wavelets. Simple classifiers are then used to learn correlations between features extracted from all modalities. Through several experiments on a novel dataset collected via an emotion-processing imagery experiment, we further show that (i) alexithymia can be detected in patients using only one stage of the experiment (elicitation of joy), and (ii) our simpler framework outperforms compared methods, including deep networks, on the task of alexithymia detection. Our proposed method achieves an accuracy of up to 92% when using simple classifiers on specific imagery tasks. The simplicity and efficiency of our approach make it suitable for low-powered embedded devices.
Citations: 1
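A generic version of the "wavelet features plus simple classifier" pipeline looks like the sketch below. The wavelet family (db4), decomposition level, energy features, signal, and labels are all assumptions for illustration; the paper's exact feature set and classifier are not specified in this listing.

```python
# Minimal sketch of a wavelet-band-energy + simple-classifier pipeline (assumed details).
import numpy as np
import pywt
from sklearn.svm import SVC

def wavelet_energies(signal, wavelet="db4", level=4):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    return np.array([np.sum(c ** 2) for c in coeffs])    # energy per sub-band

rng = np.random.default_rng(0)
X = np.stack([wavelet_energies(rng.normal(size=1024)) for _ in range(40)])  # toy signals
y = rng.integers(0, 2, size=40)                          # 1 = high alexithymia (toy labels)
clf = SVC(kernel="linear").fit(X, y)
print("training accuracy:", clf.score(X, y))
```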
Improving Supervised Learning in Conversational Analysis through Reusing Preprocessing Data as Auxiliary Supervisors
Joshua Y. Kim, Tongliang Liu, K. Yacef
DOI: 10.1145/3536220.3558034 (published 2022-11-07)
Abstract: Emotion recognition systems are trained using noisy human labels and often require heavy preprocessing during multimodal feature extraction. Using noisy labels in single-task learning increases the risk of over-fitting. Auxiliary tasks can improve the performance of the primary task when learned within the same training run, i.e., multi-task learning (MTL). In this paper, we explore how the preprocessed data used to create the textual multimodal description of the conversation, which supports conversational analysis, can be reused as auxiliary tasks (e.g., predicting future or previous labels and predicting the ranked expressions of actions and prosody), thereby promoting the productive use of data. Our main contributions are: (1) the identification of sixteen beneficial auxiliary tasks, (2) a study of how to distribute learning capacity between the primary and auxiliary tasks, and (3) a study of the relative supervision hierarchy between the primary and auxiliary tasks. Extensive experiments on the IEMOCAP and SEMAINE data validate the improvements over single-task approaches and suggest that the approach may generalize across multiple primary tasks.
Citations: 1
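The general shape of such an MTL setup, a shared encoder with one primary head and extra heads supervised by labels reused from preprocessing, can be sketched as below. The feature size, number of auxiliary heads, class counts, and the 0.3 loss weight are assumptions, not the authors' configuration.

```python
# Minimal sketch of multi-task learning with auxiliary supervisors (assumed sizes/weights).
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, n_feats=128, n_classes=4, n_aux=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_feats, 64), nn.ReLU())
        self.primary = nn.Linear(64, n_classes)                        # primary emotion head
        self.aux = nn.ModuleList([nn.Linear(64, n_classes) for _ in range(n_aux)])

    def forward(self, x):
        h = self.encoder(x)
        return self.primary(h), [head(h) for head in self.aux]

net, ce = MultiTaskNet(), nn.CrossEntropyLoss()
x = torch.randn(16, 128)
y = torch.randint(0, 4, (16,))                 # primary labels
y_aux = torch.randint(0, 4, (16, 2))           # labels reused from preprocessing
main_logits, aux_logits = net(x)
loss = ce(main_logits, y) + 0.3 * sum(ce(a, y_aux[:, i]) for i, a in enumerate(aux_logits))
loss.backward()                                # auxiliary supervision regularizes the encoder
```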
Understanding Interviewees' Perceptions and Behaviour towards Verbally and Non-verbally Expressive Virtual Interviewing Agents
Jinal D. Thakkar, Pooja S B. Rao, Kumar Shubham, Vaibhav Jain, D. Jayagopi
DOI: 10.1145/3536220.3558802 (published 2022-11-07)
Abstract: Recent technological advancements have boosted the use of virtual interviewing platforms in which candidates interact with a virtual interviewing agent, or avatar, that exhibits human-like behavior instead of a face-to-face interviewer. As a result, it is essential to understand how candidates perceive these virtual interviewing avatars and whether adding features that boost the system's interactivity makes a difference. In this work, we present the results of two studies in which a virtual interviewing avatar with verbal and non-verbal interaction capabilities was used to conduct employment interviews. We add two interactive capabilities to the avatar, namely non-verbal gestures and verbal follow-up questioning, and compare it with a simple interviewing avatar. We analyze differences in perception using self-rated measures and differences in behaviour using automatically extracted audiovisual behavioural cues. The results show that candidates speak for longer, feel less stressed, and have a better chance to perform with verbally and non-verbally expressive virtual interviewing agents.
Citations: 1
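One of the behavioural cues compared in such studies, total speaking time, can be approximated from audio alone. The sketch below uses a crude energy threshold as a stand-in voice-activity detector; the threshold, frame length, and synthetic signal are all assumptions, and the paper's actual cue-extraction pipeline is not described in this listing.

```python
# Illustrative only: estimate speaking time with a simple energy-based activity threshold.
import numpy as np

def speaking_time(audio, sr=16000, frame=0.025):
    hop = int(sr * frame)
    frames = [audio[i:i + hop] for i in range(0, len(audio) - hop, hop)]
    energy = np.array([np.mean(f ** 2) for f in frames])
    voiced = energy > 2.0 * np.median(energy)   # crude voice-activity threshold
    return voiced.sum() * frame                 # seconds classified as speech

sr = 16000
rng = np.random.default_rng(0)
audio = rng.normal(scale=0.01, size=sr * 10)    # 10 s of toy background noise
audio[3 * sr:6 * sr] *= 20                      # louder segment standing in for speech
print("estimated speaking time (s):", speaking_time(audio, sr))
```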