Emotion recognition with a Randomized CNN-multihead-attention hybrid model optimized by evolutionary intelligence algorithm
Syed Muhammad Salman Bukhari, Muhammad Hamza Zafar, Syed Kumayl Raza Moosavi, Filippo Sanfilippo
Array, vol. 26, Article 100401. Published 2025-05-07.
DOI: 10.1016/j.array.2025.100401
URL: https://www.sciencedirect.com/science/article/pii/S2590005625000281
Abstract
Emotion recognition systems are vital for various applications, yet existing models often face limitations in computational efficiency and accuracy, especially when handling complex emotional expressions in sequential data. To address these challenges, we propose an emotion recognition framework that integrates a Randomized Convolutional Neural Network (RCNN) with a Multi-Head Attention model, with the network parameters tuned by the Football Team Training Algorithm (FTTA) metaheuristic. The RCNN, characterized by fixed random weights in its convolutional layers, extracts robust and diverse features from facial landmarks while reducing computational load. It is complemented by a multi-head attention mechanism that processes temporal dynamics in emotion data, with both components optimized through FTTA to balance exploration and exploitation. We evaluate the hybrid model on a widely recognized emotion recognition dataset, where it outperforms conventional fully trainable models and alternative architectures. The results indicate a substantial improvement in classification accuracy, with an overall accuracy of 99%, and a significant reduction in computational demands, with training that is on average 65% faster than state-of-the-art models. These results support the model’s efficiency and robustness across emotion classes. The combination of the RCNN’s fixed-weight feature extraction with FTTA’s optimization offers a strong approach to emotion recognition. Its accuracy and efficiency make the model suitable for real-world applications, particularly in fields like healthcare and mental health monitoring, where real-time emotion detection can have significant impacts.
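The two building blocks named in the abstract, convolutional filters with fixed random weights and multi-head attention over time, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the input layout (a sequence of frames, each a list of 2-D landmark points), the filter and head sizes, and all function names (`random_conv1d_bank`, `conv_features`, `multi_head_attention`) are assumptions made for illustration. The FTTA optimizer is not shown, and the usual output projection of multi-head attention is omitted for brevity.

```python
import math
import random


def random_conv1d_bank(num_filters, kernel_size, in_channels, seed=0):
    """Draw the convolution filters once from a seeded RNG; they are never updated."""
    rng = random.Random(seed)
    std = 1.0 / math.sqrt(kernel_size * in_channels)
    return [[[rng.gauss(0.0, std) for _ in range(in_channels)]
             for _ in range(kernel_size)]
            for _ in range(num_filters)]


def conv_features(frame, filters):
    """Slide each fixed filter over the landmark list of one frame.

    frame: list of points, each a list of coordinates.
    Returns one value per filter: ReLU followed by global max pooling.
    """
    feats = []
    for filt in filters:
        k = len(filt)
        best = 0.0  # starting at 0 makes the max equal to max-pooled ReLU
        for start in range(len(frame) - k + 1):
            s = sum(filt[i][c] * frame[start + i][c]
                    for i in range(k) for c in range(len(filt[i])))
            best = max(best, s)
        feats.append(best)
    return feats


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(mat, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in mat]


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0 / math.sqrt(cols)) for _ in range(cols)]
            for _ in range(rows)]


def multi_head_attention(seq, heads):
    """Scaled dot-product self-attention with several heads.

    seq: list of T feature vectors (one per frame).
    heads: list of (Wq, Wk, Wv) projection matrices, one tuple per head.
    Returns T vectors, each the concatenation of all head outputs.
    """
    out = [[] for _ in seq]
    for wq, wk, wv in heads:
        q = [matvec(wq, x) for x in seq]
        k = [matvec(wk, x) for x in seq]
        v = [matvec(wv, x) for x in seq]
        scale = math.sqrt(len(q[0]))
        for t, qt in enumerate(q):
            scores = [sum(a * b for a, b in zip(qt, ks)) / scale for ks in k]
            weights = softmax(scores)
            ctx = [sum(w * vs[j] for w, vs in zip(weights, v))
                   for j in range(len(v[0]))]
            out[t].extend(ctx)
    return out


# Toy data (assumed shapes): 5 frames, 10 landmarks per frame, 2 coordinates each.
rng = random.Random(42)
sequence = [[[rng.random() for _ in range(2)] for _ in range(10)]
            for _ in range(5)]

filters = random_conv1d_bank(num_filters=8, kernel_size=3, in_channels=2, seed=0)
frame_feats = [conv_features(frame, filters) for frame in sequence]

# Two heads, each projecting the 8 conv features down to 4 dimensions.
heads = [tuple(random_matrix(4, 8, rng) for _ in range(3)) for _ in range(2)]
attended = multi_head_attention(frame_feats, heads)
print(len(attended), len(attended[0]))  # 5 frames, 8 values per frame
```

In a setup like this, only the attention projections and any downstream classifier would need tuning, for instance by a metaheuristic search, while the convolutional filters stay frozen. That is one plausible reading of where the reported reduction in training cost comes from.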