Multimodal emotion recognition in response to videos (Extended abstract)
M. Soleymani, M. Pantic, T. Pun
2015 International Conference on Affective Computing and Intelligent Interaction (ACII), pp. 491-497, 2015-09-21
DOI: 10.1109/ACII.2015.7344615
We present a user-independent emotion recognition method that aims to detect expected emotions, or affective tags, for videos using electroencephalogram (EEG) signals, pupillary response and gaze distance. We first selected 20 video clips with extrinsic emotional content from movies and online resources. EEG responses and eye gaze data were then recorded from 24 participants while they watched the clips. Ground truth was defined based on the median arousal and valence scores given to the clips in a preliminary study. The arousal classes were calm, medium aroused and activated, and the valence classes were unpleasant, neutral and pleasant. Leave-one-participant-out cross-validation was used to evaluate classification performance in a user-independent setting. The best classification accuracies, 68.5% for three labels of valence and 76.4% for three labels of arousal, were obtained using a modality fusion strategy and a support vector machine. The results over a population of 24 participants demonstrate that user-independent emotion recognition can outperform individual self-reports for arousal assessments and does not underperform for valence assessments.
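The one-participant-out evaluation described above can be illustrated with a minimal sketch of the fold construction only. It assumes one sample per participant and clip (24 x 20 = 480 samples), which the abstract does not state explicitly; the function name `leave_one_participant_out` and the index layout are hypothetical, and the feature extraction, modality fusion and support vector machine are deliberately left out.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_participant_out(
    participant_ids: Sequence[int],
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (held_out_participant, train_indices, test_indices) per fold.

    Every sample from the held-out participant goes to the test set, so the
    classifier never sees data from the person it is evaluated on.
    """
    for held_out in sorted(set(participant_ids)):
        train = [i for i, p in enumerate(participant_ids) if p != held_out]
        test = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train, test


# Assumed layout (hypothetical): 24 participants, 20 clips each,
# one sample per participant-clip response.
N_PARTICIPANTS, N_CLIPS = 24, 20
participant_ids = [p for p in range(N_PARTICIPANTS) for _ in range(N_CLIPS)]

folds = list(leave_one_participant_out(participant_ids))
for held_out, train, test in folds[:2]:
    print(f"participant {held_out}: {len(train)} train, {len(test)} test")
```

In each fold a classifier would be trained on the training indices and scored on the held-out participant, and the per-fold accuracies would then be averaged across all participants.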