{"title":"基于空间语境与面部表情时间动态解耦的情绪识别","authors":"R. Alazrai, K. M. Yousef, M. Daoud","doi":"10.1109/ISNCC.2019.8909141","DOIUrl":null,"url":null,"abstract":"This paper presents an emotion recognition approach based on decoupling the spatial context from the temporal dynamics of facial expressions in video sequences. In particular, each emotional state is represented as a set of temporal phases, where each temporal phase exhibits different temporal dynamics such as the expressing speed and the variable length of each phase. In this work, we have developed an algorithm for automatically detecting the temporal phases of human facial expressions by employing the concept of mutual information to define a similarity measure among different video frames. Moreover, we have developed a two-layer framework for emotional state recognition. The first layer utilizes the spatial context to classify the frames in an input video into emotional-specific temporal phases using a support vector machine classifier. In the second layer, dynamic time warping is used to classify the sequence of labels associated with the video frames, which is generated in the first layer, into a specific emotional state. In order to validate the performance of the proposed approach, we have conducted extensive computer simulations and the results show an average classification accuracy of 93.53% using the extended Cohn-Kanade facial-expression database.","PeriodicalId":187178,"journal":{"name":"2019 International Symposium on Networks, Computers and Communications (ISNCC)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Emotion Recognition Based on Decoupling the Spatial Context from the Temporal Dynamics of Facial Expressions\",\"authors\":\"R. Alazrai, K. M. Yousef, M. Daoud\",\"doi\":\"10.1109/ISNCC.2019.8909141\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents an emotion recognition approach based on decoupling the spatial context from the temporal dynamics of facial expressions in video sequences. In particular, each emotional state is represented as a set of temporal phases, where each temporal phase exhibits different temporal dynamics such as the expressing speed and the variable length of each phase. In this work, we have developed an algorithm for automatically detecting the temporal phases of human facial expressions by employing the concept of mutual information to define a similarity measure among different video frames. Moreover, we have developed a two-layer framework for emotional state recognition. The first layer utilizes the spatial context to classify the frames in an input video into emotional-specific temporal phases using a support vector machine classifier. In the second layer, dynamic time warping is used to classify the sequence of labels associated with the video frames, which is generated in the first layer, into a specific emotional state. 
In order to validate the performance of the proposed approach, we have conducted extensive computer simulations and the results show an average classification accuracy of 93.53% using the extended Cohn-Kanade facial-expression database.\",\"PeriodicalId\":187178,\"journal\":{\"name\":\"2019 International Symposium on Networks, Computers and Communications (ISNCC)\",\"volume\":\"70 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Symposium on Networks, Computers and Communications (ISNCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISNCC.2019.8909141\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Symposium on Networks, Computers and Communications (ISNCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISNCC.2019.8909141","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
This paper presents an emotion recognition approach based on decoupling the spatial context of facial expressions in video sequences from their temporal dynamics. In particular, each emotional state is represented as a set of temporal phases, where each phase exhibits different temporal dynamics, such as the speed of expression and the variable length of the phase. We developed an algorithm that automatically detects the temporal phases of human facial expressions by using mutual information to define a similarity measure between video frames. Moreover, we developed a two-layer framework for emotional state recognition. The first layer uses the spatial context to classify the frames of an input video into emotion-specific temporal phases with a support vector machine classifier. The second layer uses dynamic time warping to classify the label sequence generated in the first layer into a specific emotional state. To validate the proposed approach, we conducted extensive computer simulations; the results show an average classification accuracy of 93.53% on the extended Cohn-Kanade (CK+) facial expression database.
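The abstract says mutual information serves as the inter-frame similarity measure for detecting temporal phases, but gives no implementation details. Below is a minimal sketch of such a measure, assuming grayscale frames supplied as NumPy arrays; the histogram bin count and the threshold-based boundary rule are assumptions for illustration, not the paper's method.

```python
import numpy as np

def mutual_information(frame_a, frame_b, bins=32):
    """Estimate mutual information between the intensity
    distributions of two grayscale frames via a joint histogram."""
    joint, _, _ = np.histogram2d(frame_a.ravel(), frame_b.ravel(), bins=bins)
    pxy = joint / joint.sum()                  # joint probability table
    px = pxy.sum(axis=1, keepdims=True)        # marginal of frame_a
    py = pxy.sum(axis=0, keepdims=True)        # marginal of frame_b
    mask = pxy > 0                             # avoid log(0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

def detect_phase_boundaries(frames, threshold):
    """Mark frame indices where similarity to the previous frame drops,
    suggesting a transition between temporal phases (assumed rule)."""
    mi = [mutual_information(frames[i], frames[i + 1])
          for i in range(len(frames) - 1)]
    return [i + 1 for i, v in enumerate(mi) if v < threshold]
```

High mutual information between consecutive frames indicates little facial movement, while a sustained drop suggests the expression is evolving, which is why such a measure can delimit phases of differing speeds and lengths.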
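The two-layer framework is likewise only named at this level of detail: an SVM assigns each frame to an emotion-specific temporal phase from its spatial features, and DTW compares the resulting label sequence against per-emotion references. The sketch below, under those stated assumptions, pairs scikit-learn's SVC with a hand-rolled DTW using a 0/1 label-mismatch cost; the class name, feature layout, and the idea of one reference sequence per emotion are hypothetical.

```python
import numpy as np
from sklearn.svm import SVC

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two phase-label
    sequences, with a 0/1 cost for label mismatch."""
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0.0 if seq_a[i - 1] == seq_b[j - 1] else 1.0
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

class TwoLayerEmotionRecognizer:
    """Hypothetical wrapper around the two layers described in the
    abstract: SVM frame labeling followed by DTW sequence matching."""

    def __init__(self):
        self.phase_clf = SVC(kernel="rbf")  # kernel choice is an assumption

    def fit(self, frame_features, phase_labels, reference_sequences):
        # frame_features: (n_frames, n_features) spatial descriptors
        # phase_labels: one temporal-phase label per frame
        # reference_sequences: dict mapping emotion -> template label sequence
        self.phase_clf.fit(frame_features, phase_labels)
        self.references = reference_sequences
        return self

    def predict(self, video_features):
        # Layer 1: label every frame of the video with a temporal phase.
        labels = self.phase_clf.predict(video_features)
        # Layer 2: pick the emotion whose reference sequence is nearest under DTW.
        return min(self.references,
                   key=lambda emotion: dtw_distance(labels, self.references[emotion]))
```

Separating the layers this way lets the spatial model operate per frame while DTW absorbs the variable speed and length of each phase, which matches the decoupling the title describes.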