Facial Expression Recognition in Infrared Imaging Using HAPNet Segmentation and Hybrid VGG16-SVM Classifier

Impact factor: 0.8 (JCR Q4, Optics)

Optical Memory and Neural Networks. Pub Date: 2025-07-02. DOI: 10.3103/S1060992X24600599

Rupali J. Dhabarde, D. V. Kodavade, Aditya Konnur, Vijay Manwatkar

Volume 34, Issue 2, pages 146-163. Article page: https://link.springer.com/article/10.3103/S1060992X24600599

Citations: 0

Abstract



Recognizing human facial expressions is one of the most significant and challenging tasks in social interaction. Humans often convey their feelings and intentions through facial expressions in a natural and honest manner, and nonverbal communication is largely characterized by them. Various approaches to emotion classification and face recognition have been developed to improve recognition accuracy in infrared images. Significant issues in recent deep facial expression recognition (FER) systems include overfitting caused by insufficient training data and expression-unrelated variables such as identity bias, head pose, and illumination. To address these challenges, the proposed model detects facial expressions using HAPNet segmentation and a hybrid VGG16-SVM classifier. First, the images are pre-processed with an optimized Difference of Gaussians (DoG) filter to enhance edges; the Artificial Gorilla Troops Optimization (GTO) algorithm selects the kernel size that maximizes the peak signal-to-noise ratio (PSNR). Next, the face is segmented using the Hybrid, Asymmetric, and Progressive Network (HAPNet). Landmarks are then detected with Multi-Task Cascaded Convolutional Networks (MTCNN) to locate the mouth, eyes, and nose. Finally, the hybrid VGG16 with Support Vector Machine (SVM) classifier categorizes faces into seven emotions: happy, sad, disgusted, surprised, angry, fearful, and neutral. The proposed method achieves an accuracy of 96.6%, a positive predictive value (precision) of 93.08%, a hit rate (recall) of 95.2%, a selectivity (specificity) of 92.5%, a negative predictive value of 95.8%, and an F1-score of 94.49%. Experiments on the database show that the proposed approach identifies facial expressions in thermal images more accurately than conventional techniques.
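The abstract reports six evaluation metrics under less common names (positive predictive value, hit rate, selectivity). As a minimal sketch of how such figures are usually derived, the Python below computes them for one class from a confusion matrix using a one-vs-rest split. The abstract does not say how the authors aggregated the seven classes, so the one-vs-rest treatment, the function names, and the 3-class `TOY` matrix are illustrative assumptions and not the paper's data or code.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def one_vs_rest_counts(confusion, k):
    """Counts for class k, where confusion[i][j] is the number of
    samples with true class i that were predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                        # rest of row k
    fp = sum(confusion[i][k] for i in range(n)) - tp   # rest of column k
    tn = total - tp - fn - fp
    return Counts(tp, fp, fn, tn)


def metrics(c):
    """Standard definitions; assumes every denominator is non-zero."""
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    return {
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn),
        "ppv": precision,                      # positive predictive value
        "hit_rate": recall,                    # recall / sensitivity
        "selectivity": c.tn / (c.tn + c.fp),   # specificity
        "npv": c.tn / (c.tn + c.fn),           # negative predictive value
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical 3-class confusion matrix, made up for illustration only.
TOY = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]

if __name__ == "__main__":
    for k in range(len(TOY)):
        print(k, metrics(one_vs_rest_counts(TOY, k)))
```

Averaging the per-class values (macro averaging) is one common way to get a single figure per metric for a seven-class problem; whether the paper did so is not stated in the abstract.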


Source journal

Optical Memory and Neural Networks (Optics)

CiteScore: 1.50

Self-citation rate: 11.10%

Journal description: The journal covers a wide range of issues in information optics such as optical memory, mechanisms for optical data recording and processing, photosensitive materials, optical, optoelectronic and holographic nanostructures, and many other related topics. Papers on memory systems using holographic and biological structures and concepts of brain operation are also included. The journal pays particular attention to research in the field of neural net systems that may lead to a new generation of computational technologies by endowing them with intelligence.