Multi-Feature Fusion for Multimodal Attentive Sentiment Analysis
A. Man, Yuanyuan Pu, Dan Xu, Wenhua Qian, Zhengpeng Zhao, Qiuxia Yang
Proceedings of the ACM Multimedia Asia, 2019-12-15. DOI: 10.1145/3338533.3366591
Citations: 1
Abstract
Sentiment analysis is an interesting and challenging task. Researchers have mostly focused on single-modal (image or text) emotion recognition, and less attention has been paid to the joint analysis of multi-modal data. Most existing multi-modal sentiment analysis algorithms that incorporate an attention mechanism focus only on local regions of images and ignore the emotional information provided by the image's global features. Motivated by this state of the art, we propose a novel multi-modal sentiment analysis model that attends to local image features as well as the global contextual feature of the image, and a novel feature fusion mechanism is then used to fuse the features from the different modalities. In the proposed model, a convolutional neural network (CNN) extracts the region maps of an image, an attention mechanism produces the attention coefficients, a CNN with fewer hidden layers extracts the global feature, and a long short-term memory network (LSTM) extracts the textual feature. Finally, a tensor fusion network (TFN) fuses all features from the different modalities. Extensive experiments on both weakly labeled and manually labeled datasets demonstrate the superiority of the proposed method.
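The abstract outlines a concrete pipeline: attended CNN region features, a shallow-CNN global image feature, and an LSTM text feature, fused with a tensor fusion network. Below is a minimal PyTorch sketch of that pipeline; the layer sizes, the shallow global backbone, the fusion dimension, and all module names are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch of the described pipeline (assumed shapes and layer sizes).
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocalAttentiveBranch(nn.Module):
    """Weights CNN region features by learned attention coefficients."""
    def __init__(self, region_dim=512, hidden=128):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(region_dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, 1))

    def forward(self, regions):                          # regions: (B, N, region_dim)
        alpha = F.softmax(self.score(regions), dim=1)    # attention coefficients over regions
        return (alpha * regions).sum(dim=1)              # attended local feature: (B, region_dim)


class TensorFusion(nn.Module):
    """TFN-style fusion: outer product of bias-augmented modality vectors."""
    def forward(self, feats):
        ones = lambda x: torch.cat([x, x.new_ones(x.size(0), 1)], dim=1)
        fused = ones(feats[0])
        for f in feats[1:]:
            fused = torch.bmm(fused.unsqueeze(2), ones(f).unsqueeze(1))
            fused = fused.flatten(start_dim=1)
        return fused


class MultimodalSentimentModel(nn.Module):
    def __init__(self, region_dim=512, fuse_dim=64, vocab=10000,
                 embed_dim=300, num_classes=2):
        super().__init__()
        self.local = LocalAttentiveBranch(region_dim)
        self.local_proj = nn.Linear(region_dim, fuse_dim)
        # Shallow CNN standing in for the "fewer hidden layers" global branch.
        self.global_cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, fuse_dim))
        self.embed = nn.Embedding(vocab, embed_dim)
        self.lstm = nn.LSTM(embed_dim, fuse_dim, batch_first=True)
        self.fusion = TensorFusion()
        self.classifier = nn.Linear((fuse_dim + 1) ** 3, num_classes)

    def forward(self, regions, image, tokens):
        local_feat = self.local_proj(self.local(regions))   # local attentive feature
        global_feat = self.global_cnn(image)                 # global contextual feature
        _, (h, _) = self.lstm(self.embed(tokens))            # textual feature
        fused = self.fusion([local_feat, global_feat, h[-1]])
        return self.classifier(fused)


# Usage with dummy inputs (batch of 4, 9 region features per image, 20 tokens).
model = MultimodalSentimentModel()
regions = torch.randn(4, 9, 512)
image = torch.randn(4, 3, 224, 224)
tokens = torch.randint(0, 10000, (4, 20))
logits = model(regions, image, tokens)    # (4, num_classes)
```

The sketch keeps each fused modality vector small (64-d) before the outer product, since the TFN output grows multiplicatively with the per-modality dimensions.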