A Method for Improving AlexNet’s Performance in The Area of Facial Expressions Recognition
Akhmad Sarif, D. Gunawan
2023 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT), 2023-07-13
https://doi.org/10.1109/IAICT59002.2023.10205951
Facial Expression Recognition (FER) from digital images has advanced considerably alongside computer vision and artificial intelligence. Deep learning approaches to FER show promising results and can make classifying millions of digital images easier and more accurate. However, facial expressions are still sometimes misclassified. This paper proposes a method for improving the AlexNet model for use in FER. The image dataset was pre-processed by resizing each image to 227x227, converting it to RGB (Red, Green, Blue) format, adjusting its contrast with CLAHE (Contrast Limited Adaptive Histogram Equalization), and augmenting the dataset by cropping. The AlexNet model was then fine-tuned by replacing the ReLU activation function with Leaky ReLU, switching normalization from cross-channel to batch normalization, changing the two dropout values (from 0.5 to 0.3 and 0), and reducing the number of output classes from 1000 to 7. The experimental results show that the proposed method improves on standard AlexNet's accuracy by 24.82% on the CK+ dataset and by 20.05% on the KDEF dataset. Unlike the standard AlexNet model, the proposed method produced no misclassified facial expressions.
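The model changes named in the abstract can be illustrated with a small framework-free sketch. It is not the authors' implementation. The layer hyperparameters are those of the original AlexNet design and are assumed here, since the abstract does not list them. The Leaky ReLU slope of 0.01 is likewise an assumption, and the helper names (`conv_out`, `alexnet_feature_size`, `MODIFIED_HEAD`) are hypothetical.

```python
import math


def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def alexnet_feature_size(input_size=227):
    """Trace the feature-map side length through AlexNet's conv stack.

    Kernel, stride, and padding values follow the original AlexNet
    design (an assumption; the abstract gives only the input size).
    """
    s = conv_out(input_size, 11, stride=4)  # conv1
    s = conv_out(s, 3, stride=2)            # max pool 1
    s = conv_out(s, 5, padding=2)           # conv2
    s = conv_out(s, 3, stride=2)            # max pool 2
    for _ in range(3):                      # conv3, conv4, conv5
        s = conv_out(s, 3, padding=1)
    return conv_out(s, 3, stride=2)         # max pool 3


def relu(x):
    """Standard ReLU: negative inputs are clamped to zero."""
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: negative inputs keep a small slope (value assumed)."""
    return x if x >= 0 else negative_slope * x


def batch_norm(values, eps=1e-5):
    """Normalize one feature across a batch to zero mean, unit variance.

    The learnable scale and shift of a real batch-norm layer are omitted.
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


# Classifier-head settings stated in the abstract: dropout changed from
# 0.5 to 0.3 and 0, and the output layer reduced from 1000 to 7 classes.
MODIFIED_HEAD = {"dropout": (0.3, 0.0), "num_classes": 7}

if __name__ == "__main__":
    side = alexnet_feature_size(227)
    print("final feature map:", side, "x", side)
    print("relu(-2.0) =", relu(-2.0))
    print("leaky_relu(-2.0) =", leaky_relu(-2.0))
    print("batch_norm([1, 2, 3]) =", batch_norm([1.0, 2.0, 3.0]))
```

With a 227x227 input, the assumed strides and kernels yield a 6x6 map after the last pooling layer, which is why that input size fits the standard AlexNet classifier. Unlike ReLU, Leaky ReLU keeps a small nonzero output, and therefore a nonzero gradient, for negative inputs.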