{"title":"基于规则的决策级融合的视听数据情感识别","authors":"Subhasmita Sahoo, A. Routray","doi":"10.1109/TECHSYM.2016.7872646","DOIUrl":null,"url":null,"abstract":"Emotion recognition systems aim at identifying emotions of human subjects from underlying data with acceptable accuracy. Audio and visual signals, being the primary modalities of human emotion perception, have attained the most attention in developing intelligent systems for natural interaction. The emotion recognition system must automatically identify the human emotional states from his or her voice and facial image, unaffected by all possible constraints. In this work, an audio-visual emotion recognition system has been developed that uses fusion of both the modalities at the decision level. At first, separate emotion recognition systems that use speech and facial expressions were developed and tested separately. The speech emotion recognition system was tested on two standard speech emotion databases: Berlin EMODB database and Assamese database. The efficiency of visual emotion recognition system was analyzed using the eNTREFACE'05 database. Then a decision rule was set for fusion of both audio and visual information at the decision level to identify emotions. The proposed multi-modal system has been tested on the same eNTERFACE'05 database.","PeriodicalId":403350,"journal":{"name":"2016 IEEE Students’ Technology Symposium (TechSym)","volume":"C-22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":"{\"title\":\"Emotion recognition from audio-visual data using rule based decision level fusion\",\"authors\":\"Subhasmita Sahoo, A. Routray\",\"doi\":\"10.1109/TECHSYM.2016.7872646\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Emotion recognition systems aim at identifying emotions of human subjects from underlying data with acceptable accuracy. Audio and visual signals, being the primary modalities of human emotion perception, have attained the most attention in developing intelligent systems for natural interaction. The emotion recognition system must automatically identify the human emotional states from his or her voice and facial image, unaffected by all possible constraints. In this work, an audio-visual emotion recognition system has been developed that uses fusion of both the modalities at the decision level. At first, separate emotion recognition systems that use speech and facial expressions were developed and tested separately. The speech emotion recognition system was tested on two standard speech emotion databases: Berlin EMODB database and Assamese database. The efficiency of visual emotion recognition system was analyzed using the eNTREFACE'05 database. Then a decision rule was set for fusion of both audio and visual information at the decision level to identify emotions. 
The proposed multi-modal system has been tested on the same eNTERFACE'05 database.\",\"PeriodicalId\":403350,\"journal\":{\"name\":\"2016 IEEE Students’ Technology Symposium (TechSym)\",\"volume\":\"C-22 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"24\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE Students’ Technology Symposium (TechSym)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TECHSYM.2016.7872646\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Students’ Technology Symposium (TechSym)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TECHSYM.2016.7872646","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Emotion recognition from audio-visual data using rule based decision level fusion
Emotion recognition systems aim to identify the emotions of human subjects from underlying data with acceptable accuracy. Audio and visual signals, being the primary modalities of human emotion perception, have attracted the most attention in the development of intelligent systems for natural interaction. An emotion recognition system must automatically identify a person's emotional state from his or her voice and facial image, unaffected by all possible constraints. In this work, an audio-visual emotion recognition system has been developed that fuses the two modalities at the decision level. First, separate emotion recognition systems based on speech and on facial expressions were developed and tested independently. The speech emotion recognition system was evaluated on two standard speech emotion databases: the Berlin EmoDB database and an Assamese database. The performance of the visual emotion recognition system was analyzed using the eNTERFACE'05 database. A decision rule was then defined to fuse the audio and visual information at the decision level to identify emotions. The proposed multi-modal system has been tested on the same eNTERFACE'05 database.
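
The abstract does not specify the decision rule used for fusion. As an illustration only, the sketch below shows one common form of rule-based decision-level fusion, assuming each unimodal classifier outputs per-class confidence scores over the six eNTERFACE'05 emotion classes; the agreement check, the weighting scheme, and all identifiers are hypothetical assumptions, not the authors' method.

# A minimal, hypothetical sketch of rule-based decision-level fusion in Python.
# The paper's actual decision rule is not given in the abstract; every name
# and weight below is an assumption for illustration only.

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

def fuse_decisions(audio_scores, visual_scores, audio_weight=0.5):
    """Combine per-class confidence scores from two unimodal classifiers.

    Rule 1: if both modalities agree on the top-scoring class, accept it.
    Rule 2: otherwise, choose the class with the higher weighted combined score.
    """
    audio_top = max(audio_scores, key=audio_scores.get)
    visual_top = max(visual_scores, key=visual_scores.get)

    if audio_top == visual_top:  # Rule 1: modalities agree
        return audio_top

    # Rule 2: disagreement resolved by weighted score combination
    combined = {
        e: audio_weight * audio_scores[e] + (1.0 - audio_weight) * visual_scores[e]
        for e in EMOTIONS
    }
    return max(combined, key=combined.get)

# Example with made-up confidence scores; the modalities disagree, so the
# weighted combination (Rule 2) decides: anger (0.375) beats happiness (0.35).
audio = {"anger": 0.55, "disgust": 0.05, "fear": 0.10,
         "happiness": 0.15, "sadness": 0.10, "surprise": 0.05}
visual = {"anger": 0.20, "disgust": 0.05, "fear": 0.05,
          "happiness": 0.55, "sadness": 0.05, "surprise": 0.10}
print(fuse_decisions(audio, visual))  # -> "anger"

A rule of this shape degrades gracefully: when the two classifiers agree, the fused decision costs nothing extra, and the weighted fallback lets one modality dominate only when its confidence margin is large enough to outweigh the other.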