{"title":"自发面部表情识别:基于局部的方法","authors":"N. Perveen, Dinesh Singh, C. Mohan","doi":"10.1109/ICMLA.2016.0147","DOIUrl":null,"url":null,"abstract":"A part-based approach for spontaneous expression recognition using audio-visual feature and deep convolution neural network (DCNN) is proposed. The ability of convolution neural network to handle variations in translation and scale is exploited for extracting visual features. The sub-regions, namely, eye and mouth parts extracted from the video faces are given as an input to the deep CNN (DCNN) inorder to extract convnet features. The audio features, namely, voice-report, voice intensity, and other prosodic features are used to obtain complementary information useful for classification. The confidence scores of the classifier trained on different facial parts and audio information are combined using different fusion rules for recognizing expressions. The effectiveness of the proposed approach is demonstrated on acted facial expression in wild (AFEW) dataset.","PeriodicalId":356182,"journal":{"name":"2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"44","resultStr":"{\"title\":\"Spontaneous Facial Expression Recognition: A Part Based Approach\",\"authors\":\"N. Perveen, Dinesh Singh, C. Mohan\",\"doi\":\"10.1109/ICMLA.2016.0147\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A part-based approach for spontaneous expression recognition using audio-visual feature and deep convolution neural network (DCNN) is proposed. The ability of convolution neural network to handle variations in translation and scale is exploited for extracting visual features. The sub-regions, namely, eye and mouth parts extracted from the video faces are given as an input to the deep CNN (DCNN) inorder to extract convnet features. The audio features, namely, voice-report, voice intensity, and other prosodic features are used to obtain complementary information useful for classification. The confidence scores of the classifier trained on different facial parts and audio information are combined using different fusion rules for recognizing expressions. 
The effectiveness of the proposed approach is demonstrated on acted facial expression in wild (AFEW) dataset.\",\"PeriodicalId\":356182,\"journal\":{\"name\":\"2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA)\",\"volume\":\"3 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"44\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMLA.2016.0147\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2016.0147","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Spontaneous Facial Expression Recognition: A Part Based Approach
A part-based approach to spontaneous expression recognition using audio-visual features and a deep convolutional neural network (DCNN) is proposed. The ability of convolutional neural networks to handle variations in translation and scale is exploited for extracting visual features. Sub-regions, namely the eye and mouth parts extracted from the video faces, are given as input to the DCNN in order to extract convnet features. Audio features, namely voice report, voice intensity, and other prosodic features, are used to obtain complementary information useful for classification. The confidence scores of classifiers trained on the different facial parts and on the audio information are combined using different fusion rules to recognize expressions. The effectiveness of the proposed approach is demonstrated on the Acted Facial Expressions in the Wild (AFEW) dataset.
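
To make the score-level fusion step concrete, below is a minimal Python sketch of how per-classifier confidence scores might be combined. The abstract only says "different fusion rules" are used, so the sum, product, and max rules shown here are common examples, not the authors' confirmed choices; the seven-class label space (matching the AFEW expression categories) and the score values are likewise illustrative assumptions.

    import numpy as np

    # Hypothetical confidence scores over 7 expression classes,
    # one vector per classifier: eye-region DCNN, mouth-region DCNN,
    # and the audio (prosodic-feature) classifier.
    scores = {
        "eye":   np.array([0.10, 0.05, 0.40, 0.15, 0.10, 0.10, 0.10]),
        "mouth": np.array([0.05, 0.10, 0.50, 0.10, 0.10, 0.05, 0.10]),
        "audio": np.array([0.20, 0.10, 0.30, 0.10, 0.10, 0.10, 0.10]),
    }

    def fuse(score_dict, rule="sum"):
        """Combine classifier confidence scores with a simple fusion rule."""
        stacked = np.stack(list(score_dict.values()))  # (n_classifiers, n_classes)
        if rule == "sum":
            fused = stacked.sum(axis=0)
        elif rule == "product":
            fused = stacked.prod(axis=0)
        elif rule == "max":
            fused = stacked.max(axis=0)
        else:
            raise ValueError(f"unknown fusion rule: {rule}")
        return fused / fused.sum()  # renormalize to a distribution

    for rule in ("sum", "product", "max"):
        fused = fuse(scores, rule)
        print(rule, "-> predicted class:", fused.argmax(), fused.round(3))

In this toy setup all three rules agree on class 2, but in general the rules differ in behavior: the product rule rewards consensus across modalities, while the max rule lets a single confident classifier dominate, which is one reason fusion-based systems typically report results under several rules.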