Research on netizen sentiment recognition based on multimodal deep learning

Nan Jia, Tianhao Yao
{"title":"基于多模态深度学习的网民情感识别研究","authors":"Nan Jia, Tianhao Yao","doi":"10.1117/12.3032110","DOIUrl":null,"url":null,"abstract":"With the rapid popularity of social media and the Internet, network security issues are becoming increasingly prominent. More and more people are accustomed to expressing their emotions and opinions online, and the expression of netizens’ emotions is becoming more and more diversified. Accurate analysis of netizens’ emotions is particularly important. Traditional emotion recognition methods are mainly based on text analysis, but with the diversification of network media, single text analysis has been unable to meet the actual needs. Therefore, continuously exploring the application of multimodal deep learning in netizen emotion recognition has become an inevitable choice for public security organs. This paper aims to explore the application of multimodal deep learning in netizen emotion recognition research. Therefore, this study uses multimodal datasets of text and images, and constructs BERT and VGG-16(fine-tuning) models to extract emotional features from text mode and image mode respectively. By introducing the multi-head attention mechanism, the two modes are combined to establish a fusion model, and explores how to combine them to improve classification performance. The final accuracy of text modality is 0.70, the accuracy of image modality is 0.58, and the accuracy of multimodal fusion model is 0.73, which is 0.03 and 0.15 higher than that of text modality and image modality, respectively, proving the scientific nature of multimodal fusion model. It can provide new ideas and methods for the analysis and early warning of public security organs, and also provide reference and inspiration for the research in other fields.","PeriodicalId":342847,"journal":{"name":"International Conference on Algorithms, Microchips and Network Applications","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Research on netizen sentiment recognition based on multimodal deep learning\",\"authors\":\"Nan Jia, Tianhao Yao\",\"doi\":\"10.1117/12.3032110\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the rapid popularity of social media and the Internet, network security issues are becoming increasingly prominent. More and more people are accustomed to expressing their emotions and opinions online, and the expression of netizens’ emotions is becoming more and more diversified. Accurate analysis of netizens’ emotions is particularly important. Traditional emotion recognition methods are mainly based on text analysis, but with the diversification of network media, single text analysis has been unable to meet the actual needs. Therefore, continuously exploring the application of multimodal deep learning in netizen emotion recognition has become an inevitable choice for public security organs. This paper aims to explore the application of multimodal deep learning in netizen emotion recognition research. Therefore, this study uses multimodal datasets of text and images, and constructs BERT and VGG-16(fine-tuning) models to extract emotional features from text mode and image mode respectively. By introducing the multi-head attention mechanism, the two modes are combined to establish a fusion model, and explores how to combine them to improve classification performance. 
The final accuracy of text modality is 0.70, the accuracy of image modality is 0.58, and the accuracy of multimodal fusion model is 0.73, which is 0.03 and 0.15 higher than that of text modality and image modality, respectively, proving the scientific nature of multimodal fusion model. It can provide new ideas and methods for the analysis and early warning of public security organs, and also provide reference and inspiration for the research in other fields.\",\"PeriodicalId\":342847,\"journal\":{\"name\":\"International Conference on Algorithms, Microchips and Network Applications\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Algorithms, Microchips and Network Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1117/12.3032110\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Algorithms, Microchips and Network Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.3032110","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

With the rapid spread of social media and the Internet, network security issues have become increasingly prominent. More and more people are accustomed to expressing their emotions and opinions online, and the ways netizens express emotion are increasingly diverse, so accurate analysis of netizens' emotions is particularly important. Traditional emotion recognition methods rely mainly on text analysis, but as online media diversify, text analysis alone no longer meets practical needs. Exploring the application of multimodal deep learning to netizen emotion recognition has therefore become a natural choice for public security agencies. This paper explores that application: using a multimodal dataset of text and images, the study builds a BERT model and a fine-tuned VGG-16 model to extract emotional features from the text and image modalities, respectively. A multi-head attention mechanism then combines the two modalities into a fusion model, and the study examines how this combination improves classification performance. The final accuracy is 0.70 for the text modality, 0.58 for the image modality, and 0.73 for the multimodal fusion model, which is 0.03 and 0.15 higher than the text and image modalities, respectively, demonstrating the value of multimodal fusion. The results can offer new ideas and methods for the analysis and early-warning work of public security agencies, and provide reference and inspiration for research in other fields.
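
The abstract names its building blocks (BERT for text, a fine-tuned VGG-16 for images, multi-head attention for fusion) but gives no implementation details. The PyTorch sketch below shows one plausible way such a model could be wired together; the checkpoint names, feature dimension, head count, and mean-pooling step are illustrative assumptions, not details reported in the paper.

```python
# Minimal sketch of a BERT + VGG-16 fusion classifier with multi-head
# attention, assuming PyTorch, Hugging Face transformers, and torchvision.
# Hyperparameters (dim, num_heads, num_classes) are illustrative only.
import torch
import torch.nn as nn
from transformers import BertModel
from torchvision.models import vgg16

class MultimodalFusionClassifier(nn.Module):
    def __init__(self, num_classes=2, dim=768, num_heads=8):
        super().__init__()
        # Text branch: pretrained BERT; the hidden state at the [CLS]
        # position serves as the sentence-level emotional feature.
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        # Image branch: pretrained VGG-16 whose final classifier layer is
        # replaced so its output matches the text feature dimension
        # (the layers stay trainable, i.e. the backbone is fine-tuned).
        self.vgg = vgg16(weights="IMAGENET1K_V1")
        self.vgg.classifier[6] = nn.Linear(4096, dim)
        # Fusion: multi-head attention over the two modality vectors.
        self.attn = nn.MultiheadAttention(embed_dim=dim,
                                          num_heads=num_heads,
                                          batch_first=True)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, input_ids, attention_mask, image):
        text_feat = self.bert(input_ids=input_ids,
                              attention_mask=attention_mask
                              ).last_hidden_state[:, 0]    # (B, 768)
        img_feat = self.vgg(image)                          # (B, 768)
        # Stack the two modality vectors as a length-2 sequence, let each
        # attend to the other, then mean-pool the attended features.
        seq = torch.stack([text_feat, img_feat], dim=1)     # (B, 2, 768)
        fused, _ = self.attn(seq, seq, seq)
        return self.classifier(fused.mean(dim=1))           # (B, classes)
```

In this arrangement the attention layer lets each modality reweight itself against the other before pooling; other fusion choices consistent with the abstract (feature concatenation, or cross-attention with text as the query) would fit its description equally well.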