Facial Expression Recognition by Deep Learning Models Using Multiple Datasets
Takashi Kuremoto, Yuya Mori, Shingo Mabu
Electronics and Communications in Japan, Vol. 108, No. 2. Published 2025-04-30. DOI: 10.1002/ecj.12484
https://onlinelibrary.wiley.com/doi/10.1002/ecj.12484
Facial Expression Recognition has been studied for many years; however, it remains a challenging task in real-world environments due to complex backgrounds, varying illumination conditions, and online processing issues. In this study, we propose a deep learning model, CAER-Net-RS, by leveraging multiple training datasets. The proposed model integrates three neural networks: the Face Network, the Context Network, and the Adaptive Network. Different datasets are employed for the pretraining of these networks: the facial expression image dataset RAF-DB for the Face Network, the scene image dataset Places365-Standard for the Context Network, and the CAER-S dataset for the Adaptive Network. In the experiment, the proposed model achieved an average recognition accuracy of 85.20% across seven types of facial expressions, compared to 70.92% for the conventional Context-Aware Emotion Recognition Network (CAER-Net).
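The abstract names the three component networks but does not say how their outputs are combined. As a rough illustration only, the sketch below assumes a softmax-weighted fusion: a face feature vector and a context feature vector are each scaled by a weight, the two weights sum to 1, and the concatenated result goes to a seven-class linear classifier. The function names (`adaptive_fusion`, `classify`), the scalar scores, and the toy feature values are all hypothetical; in a trained model the features and scores would come from learned layers, not hand-set numbers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(face_feat, context_feat, face_score, context_score):
    """Scale each stream by a softmax weight, then concatenate.

    Assumption: the two scalar scores stand in for the output of a
    learned attention layer, which is not modeled here.
    """
    w_face, w_context = softmax([face_score, context_score])
    fused = [w_face * x for x in face_feat] + [w_context * x for x in context_feat]
    return fused, (w_face, w_context)


def classify(fused, weights, biases):
    """Linear layer followed by softmax over the expression classes."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


# Toy usage: equal scores give each stream a weight of 0.5.
face_feat = [1.0, 2.0]   # hypothetical Face Network output
context_feat = [4.0]     # hypothetical Context Network output
fused, (w_face, w_context) = adaptive_fusion(face_feat, context_feat, 0.0, 0.0)

# Seven expression classes with untrained (all-zero) parameters.
weights = [[0.0] * len(fused) for _ in range(7)]
biases = [0.0] * 7
probs = classify(fused, weights, biases)
```

With all-zero classifier parameters every class receives the same probability, so the example only demonstrates the data flow, not recognition performance.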
About the journal:
Electronics and Communications in Japan (ECJ) is an official journal of the Institute of Electrical Engineers of Japan (IEEJ). It publishes papers translated from the Transactions of the Institute of Electrical Engineers of Japan 12 times per year. ECJ aims to provide world-class research in highly diverse and sophisticated areas of electrical and electronic engineering and related disciplines, with emphasis on electronic circuits, controls, and communications. ECJ focuses on the following fields:
- Electronic theory and circuits,
- Control theory,
- Communications,
- Cryptography,
- Biomedical fields,
- Surveillance,
- Robotics,
- Sensors and actuators,
- Micromachines,
- Image analysis and signal analysis,
- New materials.
For works related to the science, technology, and applications of electric power, please refer to the sister journal Electrical Engineering in Japan (EEJ).