Authors: M. Fortin, B. Chaib-draa
Venue: International Conference on Pattern Recognition Applications and Methods
DOI: 10.5220/0007313503680376
Published: 2019-02-19
Citations: 20
Multimodal Sentiment Analysis: A Multitask Learning Approach
Multimodal sentiment analysis has recently received increasing interest. However, most methods assume that the text and image modalities are always available at test time. This assumption is often violated in real environments (e.g. social media), since users do not always publish a text together with an image. In this paper we propose a method based on a multitask framework that combines multimodal information when it is available, while also handling cases where a modality is missing. Our model contains one classifier for analyzing the text, another for analyzing the image, and a third that makes the prediction by fusing both modalities. In addition to offering a solution to the problem of a missing modality, our experiments show that this multitask framework improves generalization by acting as a regularization mechanism. We also demonstrate that the model can handle a missing modality at training time, and can therefore be trained with image-only and text-only examples.
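The three-head design the abstract describes (a text classifier, an image classifier, and a fusion classifier that is used only when both modalities are present) can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual architecture: the feature dimensions, the use of single linear layers as classifiers, and the routing logic in `predict` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature and label dimensions, chosen for illustration only.
TEXT_DIM, IMAGE_DIM, N_CLASSES = 8, 8, 3

# One linear classifier head per task: text-only, image-only, and fused.
W_text = rng.normal(size=(TEXT_DIM, N_CLASSES))
W_image = rng.normal(size=(IMAGE_DIM, N_CLASSES))
W_fused = rng.normal(size=(TEXT_DIM + IMAGE_DIM, N_CLASSES))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict(text=None, image=None):
    """Route to the fusion head when both modalities are available,
    otherwise fall back to the corresponding unimodal head."""
    if text is not None and image is not None:
        return softmax(np.concatenate([text, image]) @ W_fused)
    if text is not None:
        return softmax(text @ W_text)
    if image is not None:
        return softmax(image @ W_image)
    raise ValueError("at least one modality is required")

# A post with both text and image uses the fusion head; a text-only
# post still yields a valid class distribution from the text head.
text_feat = rng.normal(size=TEXT_DIM)
image_feat = rng.normal(size=IMAGE_DIM)

p_both = predict(text=text_feat, image=image_feat)
p_text_only = predict(text=text_feat)
```

During training, the same routing lets each example contribute a loss to whichever heads its available modalities support, which is how a model of this shape could be trained on image-only and text-only examples as well as paired ones.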