{"title":"一种结合表情和手势的视频意见分类方法","authors":"Airton Gaio Junior, E. Santos","doi":"10.1109/SIBGRAPI.2018.00011","DOIUrl":null,"url":null,"abstract":"Most of the researches dealing with video-based opinion recognition problems employ the combination of data from three different sources: video, audio and text. As a consequence, they are solutions based on complex and language-dependent models. Besides such complexity, it may be observed that these current solutions attain low performance in practical applications. Focusing on overcoming these drawbacks, this work presents a method for opinion classification that uses only video as data source, more precisely, facial expression and body gesture information are extracted from online videos and combined to lead to higher classification rates. The proposed method uses feature encoding strategies to improve data representation and to facilitate the classification task in order to predict user's opinion with high accuracy and independently of the language used in videos. Experiments were carried out using three public databases and three baselines to test the proposed method. The results of these experiments show that, even performing only visual analysis of the videos, the proposed method achieves 16% higher accuracy and precision rates, when compared to baselines that analyze visual, audio and textual data video. Moreover, it is showed that the proposed method may identify emotions in videos whose language is other than the language used for training.","PeriodicalId":208985,"journal":{"name":"2018 31st SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Method for Opinion Classification in Video Combining Facial Expressions and Gestures\",\"authors\":\"Airton Gaio Junior, E. 
Santos\",\"doi\":\"10.1109/SIBGRAPI.2018.00011\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Most of the researches dealing with video-based opinion recognition problems employ the combination of data from three different sources: video, audio and text. As a consequence, they are solutions based on complex and language-dependent models. Besides such complexity, it may be observed that these current solutions attain low performance in practical applications. Focusing on overcoming these drawbacks, this work presents a method for opinion classification that uses only video as data source, more precisely, facial expression and body gesture information are extracted from online videos and combined to lead to higher classification rates. The proposed method uses feature encoding strategies to improve data representation and to facilitate the classification task in order to predict user's opinion with high accuracy and independently of the language used in videos. Experiments were carried out using three public databases and three baselines to test the proposed method. The results of these experiments show that, even performing only visual analysis of the videos, the proposed method achieves 16% higher accuracy and precision rates, when compared to baselines that analyze visual, audio and textual data video. 
Moreover, it is showed that the proposed method may identify emotions in videos whose language is other than the language used for training.\",\"PeriodicalId\":208985,\"journal\":{\"name\":\"2018 31st SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 31st SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SIBGRAPI.2018.00011\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 31st SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SIBGRAPI.2018.00011","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Method for Opinion Classification in Video Combining Facial Expressions and Gestures
Most research on video-based opinion recognition combines data from three different sources: video, audio, and text. As a consequence, the resulting solutions rely on complex, language-dependent models. Beyond this complexity, these solutions often perform poorly in practical applications. To overcome these drawbacks, this work presents an opinion-classification method that uses only video as its data source: more precisely, facial-expression and body-gesture information is extracted from online videos and combined to achieve higher classification rates. The proposed method uses feature-encoding strategies to improve data representation and facilitate the classification task, predicting a user's opinion with high accuracy and independently of the language spoken in the videos. Experiments were carried out on three public databases against three baselines. The results show that, even though it performs only visual analysis of the videos, the proposed method achieves accuracy and precision rates 16% higher than baselines that analyze visual, audio, and textual data. Moreover, it is shown that the proposed method can identify emotions in videos whose language differs from the language used for training.
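The abstract does not give implementation details of the feature encoding or classifier. A minimal sketch of the general idea it describes — early fusion of facial-expression and body-gesture descriptors into a single per-video vector, followed by a classifier — assuming the two descriptors have already been extracted (the random features, dimensions, and the nearest-centroid classifier below are placeholders, not the paper's actual pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in descriptors: in the paper these would come from facial-expression
# and body-gesture extractors; here they are random placeholders (assumption).
n_videos = 200
face_feats = rng.normal(size=(n_videos, 64))      # per-video facial descriptor
gesture_feats = rng.normal(size=(n_videos, 32))   # per-video gesture descriptor
labels = rng.integers(0, 2, size=n_videos)        # 0 = negative, 1 = positive opinion

# Early fusion: concatenate the two visual modalities into one vector per video.
fused = np.hstack([face_feats, gesture_feats])

# Simple train/test split.
split = int(0.75 * n_videos)
X_train, y_train = fused[:split], labels[:split]
X_test, y_test = fused[split:], labels[split:]

# Minimal nearest-centroid classifier as a stand-in for the paper's
# (unspecified) feature-encoding + classification stage.
centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
preds = dists.argmin(axis=1)
accuracy = (preds == y_test).mean()
print(f"toy accuracy: {accuracy:.2f}")
```

Note that because both modalities are visual, this fusion needs no audio track or transcript, which is what makes the approach language-independent.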