U-Shaped Distribution Guided Sign Language Emotion Recognition With Semantic and Movement Features
Authors: Jiangtao Zhang; Qingshan Wang; Qi Wang
Journal: IEEE Transactions on Affective Computing, vol. 15, no. 4, pp. 2180-2191
Impact factor: 9.6 (JCR Q1, Computer Science, Artificial Intelligence)
Published: 2024-06-04
DOI: 10.1109/TAFFC.2024.3409357
URL: https://ieeexplore.ieee.org/document/10547361/
Citations: 0
Abstract
Emotional expression is a bridge to human communication, especially for the hearing impaired. This paper proposes SeMER, a sign language emotion recognition method based on semantic and movement features, which explores the relationship between emotion valence and arousal in depth. The SeMER framework includes a semantic extractor, a movement feature extractor, and an emotion classifier. Contextual relations obtained from the sign language recognition task are added to the semantic extractor as prior knowledge through transfer learning, so that it better captures the affective polarity of the semantics. In the movement feature extractor, which is based on graph convolutional networks, a spatial-temporal adjacency matrix of gestures and a node attention matrix are developed to aggregate emotion-related movement features within and between gestures. The proposed emotion classifier maps the semantic and movement features to the emotion space. The validated U-shaped distributions of valence and arousal are then used to guide the relationship between them and to improve the accuracy of emotion prediction. In addition, a sign language emotion dataset, SE-Sentence, containing 5 emotions from 18 participants, is collected through armbands with built-in surface electromyography and inertial measurement unit sensors. Experimental results show that SeMER achieves an accuracy and F1 score of 88% on SE-Sentence.
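The abstract does not give SeMER's equations, so the following is only a rough sketch of the two ideas it names: a graph-convolution step that combines an adjacency matrix with a node attention matrix, and a penalty that encourages arousal to follow a U-shaped curve over valence. The function names, the elementwise use of the attention matrix, and the quadratic form `a * valence**2 + c` are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(x, adj, attention, weight):
    """One graph-convolution step: ReLU((norm(A) * M) @ X @ W).

    x:         (nodes, in_features) node features
    adj:       (nodes, nodes) non-negative adjacency matrix
    attention: (nodes, nodes) node attention matrix, applied elementwise
               (an assumption; the paper may combine it differently)
    weight:    (in_features, out_features) projection matrix
    """
    a_norm = normalize_adjacency(adj)
    return np.maximum((a_norm * attention) @ x @ weight, 0.0)


def u_shape_penalty(valence, arousal, a=1.0, c=0.0):
    """Mean squared deviation of arousal from a hypothetical
    U-shaped curve a * valence**2 + c (assumed quadratic form)."""
    return float(np.mean((arousal - (a * valence ** 2 + c)) ** 2))


# Toy example: three nodes connected in a chain.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
attention = np.ones((3, 3))                  # uniform attention
features = np.eye(3)                         # one-hot node features
weight = np.ones((3, 2))

hidden = gcn_layer(features, adj, attention, weight)
penalty = u_shape_penalty(np.array([-1.0, 0.0, 1.0]),
                          np.array([1.0, 0.0, 1.0]))
```

In this toy setup the penalty is zero because the arousal values lie exactly on the assumed curve; in a training loop such a term would typically be added to the classification loss with a small weight.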
About the journal:
The IEEE Transactions on Affective Computing is an international and interdisciplinary journal. Its primary goal is to share research findings on the development of systems capable of recognizing, interpreting, and simulating human emotions and related affective phenomena. The journal publishes original research on the underlying principles and theories that explain how and why affective factors shape human-technology interactions. It also focuses on how techniques for sensing and simulating affect can enhance our understanding of human emotions and processes. Additionally, the journal explores the design, implementation, and evaluation of systems that prioritize the consideration of affect in their usability. We also welcome surveys of existing work that provide new perspectives on the historical and future directions of this field.