{"title":"Evaluation of Fusion Techniques for Multi-modal Sentiment Analysis","authors":"Rishabh Shinde, Pallavi Udatewar, Amruta Nandargi, Siddarth Mohan, Ranjana Agrawal, Pankaj Nirale","doi":"10.1109/ASSIC55218.2022.10088291","DOIUrl":null,"url":null,"abstract":"Sentiment Analysis a subset of Affective Computing is often categorized as a Natural Language Processing task and is restricted to the textual modality. Since the world around us is multimodal, i.e., we see things, listen to sounds, and feel the various textures of objects, sentiment analysis must be applied to the different modalities present in our daily lives. In this paper, we have implemented sentiment analysis on the following two modalities - text and image. The study compares the performance of individual single-modal models to the performance of a multimodal model for the task of sentiment analysis. This study employs the use of a functional RNN model for textual sentiment analysis and a functional CNN model for visual sentiment analysis. Multimodality is achieved by performing fusion. Additionally, a comparison of two types of fusion is explored, namely Intermediate fusion and Late fusion. There is an improvement from previous studies that is evident from the experimental results where our fusion model gives an accuracy of 79.63%. The promising results from the study will prove to be helpful for budding researchers in exploring prospects in the field of multimodality and affective domain.","PeriodicalId":441406,"journal":{"name":"2022 International Conference on Advancements in Smart, Secure and Intelligent Computing (ASSIC)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Advancements in Smart, Secure and Intelligent Computing (ASSIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASSIC55218.2022.10088291","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Sentiment analysis, a subset of affective computing, is often categorized as a natural language processing task and restricted to the textual modality. The world around us, however, is multimodal: we see things, listen to sounds, and feel the various textures of objects, so sentiment analysis must be applied to the different modalities present in our daily lives. In this paper, we implement sentiment analysis on two modalities: text and image. The study compares the performance of individual single-modal models against that of a multimodal model on the sentiment analysis task. It employs a functional RNN model for textual sentiment analysis and a functional CNN model for visual sentiment analysis, with multimodality achieved through fusion. Additionally, two types of fusion are compared, namely intermediate fusion and late fusion. The experimental results show an improvement over previous studies, with our fusion model achieving an accuracy of 79.63%. These promising results should help budding researchers explore prospects in the field of multimodality and the affective domain.
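To make the two fusion strategies concrete, below is a minimal sketch in the Keras functional API (the abstract mentions functional RNN and CNN models, which suggests this style of API). All layer sizes, the vocabulary size, the image shape, and the class count are illustrative assumptions, not the authors' configuration: intermediate fusion concatenates hidden features from the two branches before a joint classifier, while late fusion lets each branch predict independently and combines the output probabilities.

```python
# Sketch of intermediate vs. late fusion for text + image sentiment analysis.
# Hyperparameters below are assumed for illustration, not taken from the paper.
from tensorflow.keras import layers, Model

VOCAB_SIZE, MAX_LEN = 10_000, 100   # assumed text preprocessing
IMG_SHAPE = (64, 64, 3)             # assumed image input shape
NUM_CLASSES = 3                     # e.g., negative / neutral / positive

# Text branch: a small functional RNN.
text_in = layers.Input(shape=(MAX_LEN,), name="text")
x = layers.Embedding(VOCAB_SIZE, 128)(text_in)
x = layers.LSTM(64)(x)
text_feat = layers.Dense(32, activation="relu")(x)

# Image branch: a small functional CNN.
img_in = layers.Input(shape=IMG_SHAPE, name="image")
y = layers.Conv2D(32, 3, activation="relu")(img_in)
y = layers.MaxPooling2D()(y)
y = layers.Conv2D(64, 3, activation="relu")(y)
y = layers.GlobalAveragePooling2D()(y)
img_feat = layers.Dense(32, activation="relu")(y)

# Intermediate fusion: concatenate hidden features, classify jointly.
fused = layers.Concatenate()([text_feat, img_feat])
inter_out = layers.Dense(NUM_CLASSES, activation="softmax")(fused)
intermediate_model = Model([text_in, img_in], inter_out)

# Late fusion: each branch predicts on its own; combine the probabilities
# (a simple average here; a weighted average is another common choice).
text_pred = layers.Dense(NUM_CLASSES, activation="softmax")(text_feat)
img_pred = layers.Dense(NUM_CLASSES, activation="softmax")(img_feat)
late_out = layers.Average()([text_pred, img_pred])
late_model = Model([text_in, img_in], late_out)

intermediate_model.compile(optimizer="adam",
                           loss="categorical_crossentropy",
                           metrics=["accuracy"])
late_model.compile(optimizer="adam",
                   loss="categorical_crossentropy",
                   metrics=["accuracy"])
```

The design trade-off the paper explores follows directly from this structure: intermediate fusion lets the classifier learn cross-modal interactions between the feature vectors, whereas late fusion keeps the modalities independent until the final decision, making each branch easier to train and swap out.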