Facial expression recognition using visible and IR by early fusion of deep learning with attention mechanism.

IF 3.5 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

PeerJ Computer Science Pub Date : 2025-03-12 eCollection Date: 2025-01-01 DOI:10.7717/peerj-cs.2676

Muhammad Tahir Naseem, Chan-Su Lee, Tariq Shahzad, Muhammad Adnan Khan, Adnan M Abu-Mahfouz, Khmaies Ouahada

{"title":"Facial expression recognition using visible and IR by early fusion of deep learning with attention mechanism.","authors":"Muhammad Tahir Naseem, Chan-Su Lee, Tariq Shahzad, Muhammad Adnan Khan, Adnan M Abu-Mahfouz, Khmaies Ouahada","doi":"10.7717/peerj-cs.2676","DOIUrl":null,"url":null,"abstract":"<p><p>Facial expression recognition (FER) has garnered significant attention due to advances in artificial intelligence, particularly in applications like driver monitoring, healthcare, and human-computer interaction, which benefit from deep learning techniques. The motivation of this research is to address the challenges of accurately recognizing emotions despite variations in expressions across emotions and similarities between different expressions. In this work, we propose an early fusion approach that combines features from visible and infrared modalities using publicly accessible VIRI and NVIE databases. Initially, we developed single-modality models for visible and infrared datasets by incorporating an attention mechanism into the ResNet-18 architecture. We then extended this to a multi-modal early fusion approach using the same modified ResNet-18 with attention, achieving superior accuracy through the combination of convolutional neural network (CNN) and transfer learning (TL). Our multi-modal approach attained 84.44% accuracy on the VIRI database and 85.20% on the natural visible and infrared facial expression (NVIE) database, outperforming previous methods. These results demonstrate that our single-modal and multi-modal approaches achieve state-of-the-art performance in FER.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"11 ","pages":"e2676"},"PeriodicalIF":3.5000,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11935750/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.2676","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Facial expression recognition (FER) has garnered significant attention due to advances in artificial intelligence, particularly in applications like driver monitoring, healthcare, and human-computer interaction, which benefit from deep learning techniques. The motivation of this research is to address the challenges of accurately recognizing emotions despite variations in expressions across emotions and similarities between different expressions. In this work, we propose an early fusion approach that combines features from visible and infrared modalities using publicly accessible VIRI and NVIE databases. Initially, we developed single-modality models for visible and infrared datasets by incorporating an attention mechanism into the ResNet-18 architecture. We then extended this to a multi-modal early fusion approach using the same modified ResNet-18 with attention, achieving superior accuracy through the combination of convolutional neural network (CNN) and transfer learning (TL). Our multi-modal approach attained 84.44% accuracy on the VIRI database and 85.20% on the natural visible and infrared facial expression (NVIE) database, outperforming previous methods. These results demonstrate that our single-modal and multi-modal approaches achieve state-of-the-art performance in FER.

查看原文本刊更多论文

求助全文

约1分钟内获得全文求助全文

来源期刊

PeerJ Computer Science Computer Science-General Computer Science

CiteScore

6.10

自引率

5.30%

发文量

332

审稿时长

10 weeks

期刊介绍： PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.