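Deep Residual Adaptive Neural Network Based Feature Extraction for Cognitive Computing with Multimodal Sentiment Sensing and Emotion Recognition Process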

Gopal Arora, Munish Sabharwal, P. Kapila, Divya Paikaray, V. Vekariya, T. Narmadha
Journal: Int. J. Commun. Networks Inf. Secur.
Publication date: 2022-09-30
Publication type: Journal Article
DOI: 10.17762/ijcnis.v14i2.5507
URL: https://doi.org/10.17762/ijcnis.v14i2.5507
Citations: 0

Abstract

Deep Residual Adaptive Neural Network Based Feature Extraction for Cognitive Computing with Multimodal Sentiment Sensing and Emotion Recognition Process
In a healthcare setting, automatic recognition of patients' emotions is a useful facilitator: feedback about patient status and satisfaction levels can be delivered automatically to healthcare stakeholders. Multimodal human sentiment analysis is an active research topic in artificial intelligence (AI) and is a finer-grained classification problem than most other classification tasks. Inspired by emotional processing in cognitive science, splitting a complex problem into simpler ones that can be handled more easily improves performance on both binary and multi-class classification tasks. This article proposes an automated audio-visual emotion recognition model for the healthcare industry. The model uses a Deep Residual Adaptive Neural Network (DeepResANNet) for feature extraction, in which scores are computed from the differences between the feature values and class values of adjacent instances. Based on the output of feature extraction, the fusion module trains positive and negative sub-nets separately, which improves accuracy. The proposed method is evaluated on the eNTERFACE’05, BAUM-2 and MOSI databases and compared with three standard methods across several parameters. On eNTERFACE’05, DeepResANNet achieves 97.9% accuracy, 51.5% RMSE, 42.5% RAE and 44.9% MAE in 78.9 sec. On BAUM-2, it achieves 94.5% accuracy, 46.9% RMSE, 42.9% RAE and 30.2% MAE in 78.9 sec. On MOSI, it achieves 82.9% accuracy, 51.2% RMSE, 40.1% RAE and 37.6% MAE in 69.2 sec. Across the three databases, eNTERFACE’05 gives the best accuracy (97.9%), BAUM-2 gives the lowest error rates (30.2% MAE and 46.9% RMSE), and MOSI gives the lowest RAE and the shortest response time (40.1% RAE in 69.2 sec).
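
The abstract reports accuracy, RMSE, RAE and MAE but does not define them. As a minimal sketch, assuming the standard textbook definitions and numerically encoded labels (the function names and the toy data below are illustrative, not taken from the paper), the four measures could be computed as follows. Multiplying each result by 100 gives percentages in the style the abstract uses.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared difference."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def rae(y_true, y_pred):
    """Relative absolute error: total absolute error divided by the total
    absolute error of a baseline that always predicts the mean of y_true."""
    mean_true = sum(y_true) / len(y_true)
    baseline = sum(abs(t - mean_true) for t in y_true)
    if baseline == 0:
        raise ValueError("RAE is undefined when all true values are identical")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / baseline


# Hypothetical toy labels (1 = positive, 0 = negative), for illustration only.
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]

print(accuracy(y_true, y_pred))  # 0.75
print(mae(y_true, y_pred))       # 0.25
print(rmse(y_true, y_pred))      # 0.5
print(rae(y_true, y_pred))
```

Under these definitions, RAE compares the model against a trivial mean predictor, so values below 1 (100%) indicate the model does better than that baseline. Whether the paper computes the error measures on class labels or on predicted scores is not stated in the abstract.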