{"title":"基于规则的决策级融合的视听数据情感识别","authors":"Subhasmita Sahoo, A. Routray","doi":"10.1109/TECHSYM.2016.7872646","DOIUrl":null,"url":null,"abstract":"Emotion recognition systems aim at identifying emotions of human subjects from underlying data with acceptable accuracy. Audio and visual signals, being the primary modalities of human emotion perception, have attained the most attention in developing intelligent systems for natural interaction. The emotion recognition system must automatically identify the human emotional states from his or her voice and facial image, unaffected by all possible constraints. In this work, an audio-visual emotion recognition system has been developed that uses fusion of both the modalities at the decision level. At first, separate emotion recognition systems that use speech and facial expressions were developed and tested separately. The speech emotion recognition system was tested on two standard speech emotion databases: Berlin EMODB database and Assamese database. The efficiency of visual emotion recognition system was analyzed using the eNTREFACE'05 database. Then a decision rule was set for fusion of both audio and visual information at the decision level to identify emotions. The proposed multi-modal system has been tested on the same eNTERFACE'05 database.","PeriodicalId":403350,"journal":{"name":"2016 IEEE Students’ Technology Symposium (TechSym)","volume":"C-22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":"{\"title\":\"Emotion recognition from audio-visual data using rule based decision level fusion\",\"authors\":\"Subhasmita Sahoo, A. Routray\",\"doi\":\"10.1109/TECHSYM.2016.7872646\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Emotion recognition systems aim at identifying emotions of human subjects from underlying data with acceptable accuracy. Audio and visual signals, being the primary modalities of human emotion perception, have attained the most attention in developing intelligent systems for natural interaction. The emotion recognition system must automatically identify the human emotional states from his or her voice and facial image, unaffected by all possible constraints. In this work, an audio-visual emotion recognition system has been developed that uses fusion of both the modalities at the decision level. At first, separate emotion recognition systems that use speech and facial expressions were developed and tested separately. The speech emotion recognition system was tested on two standard speech emotion databases: Berlin EMODB database and Assamese database. The efficiency of visual emotion recognition system was analyzed using the eNTREFACE'05 database. Then a decision rule was set for fusion of both audio and visual information at the decision level to identify emotions. 
The proposed multi-modal system has been tested on the same eNTERFACE'05 database.\",\"PeriodicalId\":403350,\"journal\":{\"name\":\"2016 IEEE Students’ Technology Symposium (TechSym)\",\"volume\":\"C-22 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"24\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE Students’ Technology Symposium (TechSym)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TECHSYM.2016.7872646\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Students’ Technology Symposium (TechSym)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TECHSYM.2016.7872646","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Emotion recognition from audio-visual data using rule based decision level fusion
Emotion recognition systems aim to identify the emotions of human subjects from underlying data with acceptable accuracy. Audio and visual signals, being the primary modalities of human emotion perception, have attracted the most attention in the development of intelligent systems for natural interaction. An emotion recognition system must automatically identify a person's emotional state from his or her voice and facial image, unaffected by all possible constraints. In this work, an audio-visual emotion recognition system has been developed that fuses the two modalities at the decision level. First, separate emotion recognition systems based on speech and on facial expressions were developed and tested independently. The speech emotion recognition system was evaluated on two standard speech emotion databases: the Berlin EmoDB database and an Assamese database. The performance of the visual emotion recognition system was analyzed using the eNTERFACE'05 database. A decision rule was then defined to fuse the audio and visual information at the decision level to identify emotions. The proposed multi-modal system has been tested on the same eNTERFACE'05 database.
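
The abstract does not specify the decision rule used for fusion. As an illustration only, the sketch below shows one common form of rule-based decision-level fusion, assuming each unimodal classifier outputs per-class confidence scores over the six eNTERFACE'05 emotion classes; the agreement check, the weighting scheme, and all identifiers are hypothetical assumptions, not the authors' method.

# A minimal, hypothetical sketch of rule-based decision-level fusion in Python.
# The paper's actual decision rule is not given in the abstract; every name
# and weight below is an assumption for illustration only.

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

def fuse_decisions(audio_scores, visual_scores, audio_weight=0.5):
    """Combine per-class confidence scores from two unimodal classifiers.

    Rule 1: if both modalities agree on the top-scoring class, accept it.
    Rule 2: otherwise, choose the class with the higher weighted combined score.
    """
    audio_top = max(audio_scores, key=audio_scores.get)
    visual_top = max(visual_scores, key=visual_scores.get)

    if audio_top == visual_top:  # Rule 1: modalities agree
        return audio_top

    # Rule 2: disagreement resolved by weighted score combination
    combined = {
        e: audio_weight * audio_scores[e] + (1.0 - audio_weight) * visual_scores[e]
        for e in EMOTIONS
    }
    return max(combined, key=combined.get)

# Example with made-up confidence scores; the modalities disagree, so the
# weighted combination (Rule 2) decides: anger (0.375) beats happiness (0.35).
audio = {"anger": 0.55, "disgust": 0.05, "fear": 0.10,
         "happiness": 0.15, "sadness": 0.10, "surprise": 0.05}
visual = {"anger": 0.20, "disgust": 0.05, "fear": 0.05,
          "happiness": 0.55, "sadness": 0.05, "surprise": 0.10}
print(fuse_decisions(audio, visual))  # -> "anger"

A rule of this shape degrades gracefully: when the two classifiers agree, the fused decision costs nothing extra, and the weighted fallback lets one modality dominate only when its confidence margin is large enough to outweigh the other.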