{"title":"视频中的面部表情识别:基于CNN-LSTM的视频分类模型","authors":"Muhammad Abdullah, Mobeen Ahmad, Dongil Han","doi":"10.1109/ICEIC49074.2020.9051332","DOIUrl":null,"url":null,"abstract":"Facial Expressions are an integral part of human communication. Therefore, correct classification of facial expression in image and video data has been an important quest for researchers and software development industry. In this paper we propose the video classification method using Recurrent Neural Networks (RNN) in addition to Convolution Neural Networks (CNN) to capture temporal as well spatial features of a video sequence. The methodology is tested on The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). Since no other results were available on this dataset using only visual analysis, the proposed method provides the first benchmark of 61% test accuracy on given dataset.","PeriodicalId":271345,"journal":{"name":"2020 International Conference on Electronics, Information, and Communication (ICEIC)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"Facial Expression Recognition in Videos: An CNN-LSTM based Model for Video Classification\",\"authors\":\"Muhammad Abdullah, Mobeen Ahmad, Dongil Han\",\"doi\":\"10.1109/ICEIC49074.2020.9051332\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Facial Expressions are an integral part of human communication. Therefore, correct classification of facial expression in image and video data has been an important quest for researchers and software development industry. In this paper we propose the video classification method using Recurrent Neural Networks (RNN) in addition to Convolution Neural Networks (CNN) to capture temporal as well spatial features of a video sequence. 
The methodology is tested on The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). Since no other results were available on this dataset using only visual analysis, the proposed method provides the first benchmark of 61% test accuracy on given dataset.\",\"PeriodicalId\":271345,\"journal\":{\"name\":\"2020 International Conference on Electronics, Information, and Communication (ICEIC)\",\"volume\":\"39 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on Electronics, Information, and Communication (ICEIC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICEIC49074.2020.9051332\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Electronics, Information, and Communication (ICEIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICEIC49074.2020.9051332","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Facial Expression Recognition in Videos: A CNN-LSTM based Model for Video Classification
Facial expressions are an integral part of human communication. Correct classification of facial expressions in image and video data has therefore been an important goal for researchers and the software industry. In this paper we propose a video classification method that uses a Recurrent Neural Network (RNN) in addition to a Convolutional Neural Network (CNN) to capture the temporal as well as the spatial features of a video sequence. The methodology is evaluated on The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). Since no other results using only visual analysis were available on this dataset, the proposed method provides the first such benchmark, at 61% test accuracy.
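The CNN-plus-RNN design described above can be sketched as follows. This is a minimal illustrative PyTorch model, not the authors' exact architecture: a small CNN encodes each frame independently, an LSTM aggregates the per-frame features over time, and a linear head produces logits for the 8 RAVDESS emotion classes. All layer sizes and names here are assumptions for illustration.

```python
# Hypothetical CNN-LSTM video classifier sketch (layer sizes are
# illustrative assumptions, not the paper's reported architecture).
import torch
import torch.nn as nn

class CNNLSTMClassifier(nn.Module):
    def __init__(self, feat_dim=128, hidden_dim=64, num_classes=8):
        super().__init__()
        # Per-frame CNN encoder (stand-in for a deeper backbone):
        # maps one RGB frame to a feat_dim-dimensional feature vector.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # LSTM aggregates the sequence of frame features over time.
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # Classification head over the final hidden state.
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, clips):
        # clips: (batch, time, channels, height, width)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1))   # (b*t, feat_dim)
        feats = feats.view(b, t, -1)            # (b, t, feat_dim)
        _, (h_n, _) = self.lstm(feats)          # h_n: (layers, b, hidden)
        return self.head(h_n[-1])               # logits: (b, num_classes)

model = CNNLSTMClassifier()
# Two dummy clips of 16 frames each at 64x64 resolution.
logits = model(torch.randn(2, 16, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 8])
```

The key design point the abstract alludes to is this split of responsibilities: the CNN captures spatial structure within each frame, while the LSTM captures how expressions evolve across frames.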