Radio Frequency Sensing–Based Human Emotion Identification by Leveraging 2D Transformation Techniques and Deep Learning Models

Authors: Najah AbuAli; Ihtesham Jadoon; Farman Ullah; Mohammad Hayajneh; Shayma Alkobaisi
DOI: 10.1109/OJCS.2025.3580570
Journal: IEEE Open Journal of the Computer Society, vol. 6, pp. 1178–1189
Publication date: 2025-06-17 (Journal Article)
URL: https://ieeexplore.ieee.org/document/11039034/
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11039034
Citations: 0
Abstract
To meet the need for a reliable, contactless, and noninvasive sensing platform for emotion recognition, the one-dimensional (1D) time-series signal dataset obtained using the designed RFS–SDR platform is processed and transformed into two-dimensional (2D) time–frequency images (TFIs) via the continuous wavelet transform (CWT), short-time Fourier transform (STFT), and wavelet coherence transform (WCT). These TFIs are then fed into pretrained deep learning models (AlexNet, ResNet18, and GoogleNet) to extract features for identifying eight emotions under varying batch sizes and optimizers. The core objective is to evaluate how effectively deep learning models classify the transformed time-frequency features, comparing their performance across transformation techniques and training conditions. The results show that AlexNet consistently outperforms the other models, achieving accuracy, precision, recall, and F1 scores of up to 98%, particularly when combined with the SGDM optimizer and CWT features. ResNet18 also performs strongly, with accuracy reaching 99% when paired with the Adam optimizer and CWT features, and GoogleNet likewise exhibits high accuracy with Adam. AlexNet is robust, maintaining high precision and recall across all configurations. Computational analysis reveals that AlexNet is time-efficient, particularly at large batch sizes, while GoogleNet incurs higher computational costs due to its more complex architecture. The study underscores the impact of optimizer selection, batch size, and feature-extraction method on model performance and computational efficiency, offering valuable insights for optimizing deep learning models for RFS-based human emotion recognition.
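The core preprocessing step the abstract describes (turning a 1D radio-frequency time series into a 2D time-frequency image suitable for a pretrained CNN) can be sketched as below. This is a minimal illustration, not the authors' pipeline: it uses the STFT (one of the three transforms studied) via SciPy, a synthetic two-tone signal as a stand-in for an RFS–SDR segment, and assumed parameter values (sampling rate, window length).

```python
import numpy as np
from scipy.signal import stft

# Synthetic stand-in for one RFS-SDR time-series segment (assumed values).
fs = 1000  # assumed sampling rate, Hz
t = np.arange(0, 2.0, 1 / fs)
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)

def signal_to_tfi(x, fs, nperseg=128):
    """Convert a 1D signal into a 2D time-frequency image (STFT magnitude),
    normalized to [0, 1] so it can be replicated across channels and fed to
    a pretrained CNN such as AlexNet or ResNet18."""
    f, tt, Zxx = stft(x, fs=fs, nperseg=nperseg)
    mag = np.abs(Zxx)             # magnitude spectrogram
    img = np.log1p(mag)           # log scaling compresses dynamic range
    img = (img - img.min()) / (img.max() - img.min() + 1e-12)
    return img                    # shape: (freq_bins, time_frames)

tfi = signal_to_tfi(signal, fs)
```

With `nperseg=128` the image has `128 // 2 + 1 = 65` frequency bins; in practice each TFI would then be resized to the network's expected input (e.g. 224×224) before feature extraction. Substituting `pywt.cwt` for `stft` would give the CWT variant the paper reports as best-performing.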