MM-EMOR: Multi-Modal Emotion Recognition of Social Media Using Concatenated Deep Learning Networks

IF 3.7 · Q2 · Computer Science, Artificial Intelligence
Omar Adel, Karma M. Fathalla, Ahmed Abo ElFarag
{"title":"MM-EMOR:使用连接深度学习网络的社交媒体多模态情感识别","authors":"Omar Adel, Karma M.Fathalla, Ahmed Abo ElFarag","doi":"10.3390/bdcc7040164","DOIUrl":null,"url":null,"abstract":"Emotion recognition is crucial in artificial intelligence, particularly in the domain of human–computer interaction. The ability to accurately discern and interpret emotions plays a critical role in helping machines to effectively decipher users’ underlying intentions, allowing for a more streamlined interaction process that invariably translates into an elevated user experience. The recent increase in social media usage, as well as the availability of an immense amount of unstructured data, has resulted in a significant demand for the deployment of automated emotion recognition systems. Artificial intelligence (AI) techniques have emerged as a powerful solution to this pressing concern in this context. In particular, the incorporation of multimodal AI-driven approaches for emotion recognition has proven beneficial in capturing the intricate interplay of diverse human expression cues that manifest across multiple modalities. The current study aims to develop an effective multimodal emotion recognition system known as MM-EMOR in order to improve the efficacy of emotion recognition efforts focused on audio and text modalities. The use of Mel spectrogram features, Chromagram features, and the Mobilenet Convolutional Neural Network (CNN) for processing audio data are central to the operation of this system, while an attention-based Roberta model caters to the text modality. The methodology of this study is based on an exhaustive evaluation of this approach across three different datasets. Notably, the empirical findings show that MM-EMOR outperforms competing models across the same datasets. This performance boost is noticeable, with accuracy gains of an impressive 7% on one dataset and a substantial 8% on another. Most significantly, the observed increase in accuracy for the final dataset was an astounding 18%.","PeriodicalId":36397,"journal":{"name":"Big Data and Cognitive Computing","volume":"2 1","pages":"0"},"PeriodicalIF":3.7000,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"MM-EMOR: Multi-Modal Emotion Recognition of Social Media Using Concatenated Deep Learning Networks\",\"authors\":\"Omar Adel, Karma M.Fathalla, Ahmed Abo ElFarag\",\"doi\":\"10.3390/bdcc7040164\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Emotion recognition is crucial in artificial intelligence, particularly in the domain of human–computer interaction. The ability to accurately discern and interpret emotions plays a critical role in helping machines to effectively decipher users’ underlying intentions, allowing for a more streamlined interaction process that invariably translates into an elevated user experience. The recent increase in social media usage, as well as the availability of an immense amount of unstructured data, has resulted in a significant demand for the deployment of automated emotion recognition systems. Artificial intelligence (AI) techniques have emerged as a powerful solution to this pressing concern in this context. In particular, the incorporation of multimodal AI-driven approaches for emotion recognition has proven beneficial in capturing the intricate interplay of diverse human expression cues that manifest across multiple modalities. 
The current study aims to develop an effective multimodal emotion recognition system known as MM-EMOR in order to improve the efficacy of emotion recognition efforts focused on audio and text modalities. The use of Mel spectrogram features, Chromagram features, and the Mobilenet Convolutional Neural Network (CNN) for processing audio data are central to the operation of this system, while an attention-based Roberta model caters to the text modality. The methodology of this study is based on an exhaustive evaluation of this approach across three different datasets. Notably, the empirical findings show that MM-EMOR outperforms competing models across the same datasets. This performance boost is noticeable, with accuracy gains of an impressive 7% on one dataset and a substantial 8% on another. Most significantly, the observed increase in accuracy for the final dataset was an astounding 18%.\",\"PeriodicalId\":36397,\"journal\":{\"name\":\"Big Data and Cognitive Computing\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":3.7000,\"publicationDate\":\"2023-10-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Big Data and Cognitive Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/bdcc7040164\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Big Data and Cognitive Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/bdcc7040164","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Emotion recognition is crucial in artificial intelligence, particularly in human–computer interaction. Accurately discerning and interpreting emotions helps machines decipher users' underlying intentions, streamlining interaction and improving the user experience. The recent growth of social media use, together with the immense volume of unstructured data it generates, has created significant demand for automated emotion recognition systems, and artificial intelligence (AI) techniques have emerged as a powerful answer to this need. In particular, multimodal AI-driven approaches have proven beneficial for capturing the intricate interplay of human expression cues that manifest across multiple modalities. This study develops MM-EMOR, a multimodal emotion recognition system for the audio and text modalities. Mel spectrogram and chromagram features processed by a MobileNet convolutional neural network (CNN) are central to the audio pipeline, while an attention-based RoBERTa model handles the text modality. The approach is evaluated across three datasets, on which MM-EMOR outperforms competing models, with accuracy gains of 7% on one dataset, 8% on another, and 18% on the third.
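To make the concatenated-fusion idea in the abstract concrete, the sketch below wires the described pieces together: Mel spectrogram and chromagram features stacked into an image-like tensor for a MobileNet CNN, a RoBERTa encoder for text, and a classifier over the concatenated embeddings. This is a minimal illustration under stated assumptions (librosa for feature extraction, torchvision's MobileNetV2, the Hugging Face roberta-base checkpoint, a four-class output, and a plain linear fusion head); it is not the authors' published configuration, and any extra attention layer the paper may place over RoBERTa is not reproduced here.

```python
# Minimal sketch of a concatenated audio+text emotion classifier in the spirit
# of MM-EMOR's abstract. Shapes, checkpoints, and the fusion head are
# illustrative assumptions, not the paper's exact configuration.
import librosa
import numpy as np
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2
from transformers import RobertaModel, RobertaTokenizerFast


def audio_to_image(path: str, sr: int = 16000) -> torch.Tensor:
    """Stack a Mel spectrogram and a chromagram along the frequency axis,
    then tile to 3 channels so MobileNet can consume the result like an image."""
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.power_to_db(librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128))
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)            # (12, T)
    feats = np.vstack([mel, chroma]).astype(np.float32)          # (140, T)
    return torch.from_numpy(feats).unsqueeze(0).repeat(3, 1, 1)  # (3, 140, T)


class MMEmotionNet(nn.Module):
    """Audio branch (MobileNetV2) and text branch (RoBERTa), fused by
    concatenating their pooled embeddings before a linear classifier."""

    def __init__(self, num_classes: int = 4):  # 4 classes is an assumption
        super().__init__()
        self.audio_cnn = mobilenet_v2(weights=None).features  # -> (N, 1280, h, w)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.text_encoder = RobertaModel.from_pretrained("roberta-base")
        # 1280 audio dims + 768 RoBERTa dims, concatenated then classified.
        self.classifier = nn.Linear(1280 + 768, num_classes)

    def forward(self, spec_img, input_ids, attention_mask):
        a = self.pool(self.audio_cnn(spec_img)).flatten(1)     # (N, 1280)
        t = self.text_encoder(input_ids=input_ids,
                              attention_mask=attention_mask)
        t = t.last_hidden_state[:, 0]                          # <s> token, (N, 768)
        return self.classifier(torch.cat([a, t], dim=1))


# Usage sketch ("utterance.wav" is a hypothetical file path):
tok = RobertaTokenizerFast.from_pretrained("roberta-base")
enc = tok(["I can't believe this happened!"], return_tensors="pt", padding=True)
model = MMEmotionNet(num_classes=4)
spec = audio_to_image("utterance.wav").unsqueeze(0)            # (1, 3, 140, T)
logits = model(spec, enc["input_ids"], enc["attention_mask"])  # (1, 4)
```

One property of concatenation-based fusion like this is that each modality's encoder stays independent, so either branch can be pretrained or swapped without touching the other; the joint classifier is the only component that sees both modalities.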
Source journal: Big Data and Cognitive Computing (Business, Management and Accounting: Management Information Systems)
CiteScore: 7.10 · Self-citation rate: 8.10% · Articles published: 128 · Review time: 11 weeks