Examining Subject-Dependent and Subject-Independent Human Affect Inference from Limited Video Data
R. Parameshwara, Ibrahim Radwan, Subramanian Ramanathan, Roland Goecke
2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG). DOI: 10.1109/FG57933.2023.10042798
Continuous human affect estimation from video data entails modelling the dynamic emotional state from a sequence of facial images. Though multiple affective video databases exist, they are limited in terms of data and dynamic annotations, as assigning continuous affective labels to video data is subjective, onerous and tedious. While studies have established the existence of signature facial expressions corresponding to the basic categorical emotions, individual differences in emoting facial expressions nevertheless exist; factoring out these idiosyncrasies is critical for effective emotion inference. This work explores continuous human affect recognition using AFEW-VA, an ‘in-the-wild’ video dataset with limited data, employing subject-independent (SI) and subject-dependent (SD) settings. The SI setting uses training and test sets with mutually exclusive subjects, whereas in the SD setting samples from the same subject can occur in both. A novel, dynamically-weighted loss function is employed with a Convolutional Neural Network (CNN)-Long Short-Term Memory (LSTM) architecture to optimise dynamic affect prediction. Superior prediction is achieved in the SD setting compared to the SI counterpart.
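To make the SI/SD distinction concrete, here is a minimal sketch (not taken from the paper) of the two split protocols, using scikit-learn's GroupShuffleSplit for the subject-independent case; the array names, subject IDs and split ratio are illustrative assumptions.

```python
# Illustrative sketch (not the authors' code): subject-independent (SI) vs
# subject-dependent (SD) train/test splits. Data shapes and the 80/20 ratio
# are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit, ShuffleSplit

rng = np.random.default_rng(0)
n_clips = 600
X = rng.normal(size=(n_clips, 128))           # placeholder per-clip features
subjects = rng.integers(0, 50, size=n_clips)  # one subject ID per clip

# SI: all clips from a given subject land entirely in train or in test.
si = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
si_train, si_test = next(si.split(X, groups=subjects))
assert set(subjects[si_train]).isdisjoint(subjects[si_test])

# SD: clips are split at random, so the same subject may appear in both.
sd = ShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
sd_train, sd_test = next(sd.split(X))
print(set(subjects[sd_train]) & set(subjects[sd_test]))  # typically non-empty
```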
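Similarly, a generic CNN-LSTM regressor for per-frame valence/arousal prediction can be sketched as follows. The abstract does not detail the paper's backbone or its dynamically-weighted loss, so this PyTorch outline is only an assumed shape of such an architecture, not the authors' implementation.

```python
# Generic CNN-LSTM sketch for per-frame valence/arousal regression; the tiny
# frame encoder, dimensions and head are assumptions, not the paper's model.
import torch
import torch.nn as nn

class CNNLSTMRegressor(nn.Module):
    def __init__(self, feat_dim=128, hidden=256):
        super().__init__()
        self.cnn = nn.Sequential(             # placeholder per-frame encoder
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)      # valence and arousal per frame

    def forward(self, clips):                 # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)              # temporal modelling over frames
        return self.head(out)                  # (B, T, 2)

model = CNNLSTMRegressor()
preds = model(torch.randn(2, 16, 3, 64, 64))   # 2 clips of 16 frames each
print(preds.shape)                              # torch.Size([2, 16, 2])
```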