Radio Frequency Sensing–Based Human Emotion Identification by Leveraging 2D Transformation Techniques and Deep Learning Models

Najah AbuAli;Ihtesham Jadoon;Farman Ullah;Mohammad Hayajneh;Shayma Alkobaisi
{"title":"利用二维变换技术和深度学习模型的基于射频传感的人类情感识别","authors":"Najah AbuAli;Ihtesham Jadoon;Farman Ullah;Mohammad Hayajneh;Shayma Alkobaisi","doi":"10.1109/OJCS.2025.3580570","DOIUrl":null,"url":null,"abstract":"To meet the need for a reliable, contactless, and noninvasive sensing platform for emotion recognition, The one-dimensional (1D) time-series signal dataset obtained using the designed RFS–SDR platform is processed and transformed into two-dimensional (2D) time–frequency images (TFIs) via continuous wavelet transform (CWT), short-time Fourier transform (STFT), and wavelet coherence transform (WCT). These TFIs are then fed into pretrained deep learning models—AlexNet, ResNet18, and GoogleNet—to extract features for identifying eight emotions using varying batch sizes and optimizers. The core objective is to evaluate the effectiveness of deep learning models from transformed time-frequency features, comparing their performance with different transformation techniques and training conditions. The results show that AlexNet consistently outperforms other models, achieving superior accuracy, precision, recall, and F1 scores of up to 98%, particularly when combined with the SGDM optimizer and CWT features. ResNet18 shows superior performance, with accuracy reaching 99% when paired with Adam optimizer and CWT features; furthermore, GoogleNet exhibits high accuracy with Adam. AlexNet is robust and maintains high precision and recall across all configurations. Computational analysis reveals that AlexNet is time-efficient, particularly at large batch sizes, while GoogleNet incurs higher computational costs due to its complex architecture. The study underscores the impacts of the optimizer selection, batch size, and feature extraction methods on model performance and computational efficiency, offering valuable insights for optimizing deep learning models for RFS-based human emotion recognition.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"6 ","pages":"1178-1189"},"PeriodicalIF":0.0000,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11039034","citationCount":"0","resultStr":"{\"title\":\"Radio Frequency Sensing–Based Human Emotion Identification by Leveraging 2D Transformation Techniques and Deep Learning Models\",\"authors\":\"Najah AbuAli;Ihtesham Jadoon;Farman Ullah;Mohammad Hayajneh;Shayma Alkobaisi\",\"doi\":\"10.1109/OJCS.2025.3580570\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"To meet the need for a reliable, contactless, and noninvasive sensing platform for emotion recognition, The one-dimensional (1D) time-series signal dataset obtained using the designed RFS–SDR platform is processed and transformed into two-dimensional (2D) time–frequency images (TFIs) via continuous wavelet transform (CWT), short-time Fourier transform (STFT), and wavelet coherence transform (WCT). These TFIs are then fed into pretrained deep learning models—AlexNet, ResNet18, and GoogleNet—to extract features for identifying eight emotions using varying batch sizes and optimizers. The core objective is to evaluate the effectiveness of deep learning models from transformed time-frequency features, comparing their performance with different transformation techniques and training conditions. 
The results show that AlexNet consistently outperforms other models, achieving superior accuracy, precision, recall, and F1 scores of up to 98%, particularly when combined with the SGDM optimizer and CWT features. ResNet18 shows superior performance, with accuracy reaching 99% when paired with Adam optimizer and CWT features; furthermore, GoogleNet exhibits high accuracy with Adam. AlexNet is robust and maintains high precision and recall across all configurations. Computational analysis reveals that AlexNet is time-efficient, particularly at large batch sizes, while GoogleNet incurs higher computational costs due to its complex architecture. The study underscores the impacts of the optimizer selection, batch size, and feature extraction methods on model performance and computational efficiency, offering valuable insights for optimizing deep learning models for RFS-based human emotion recognition.\",\"PeriodicalId\":13205,\"journal\":{\"name\":\"IEEE Open Journal of the Computer Society\",\"volume\":\"6 \",\"pages\":\"1178-1189\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-06-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11039034\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Open Journal of the Computer Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11039034/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Open Journal of the Computer Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/11039034/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

To meet the need for a reliable, contactless, and noninvasive sensing platform for emotion recognition, the one-dimensional (1D) time-series signal dataset obtained with the designed RFS–SDR platform is processed and transformed into two-dimensional (2D) time–frequency images (TFIs) via the continuous wavelet transform (CWT), short-time Fourier transform (STFT), and wavelet coherence transform (WCT). These TFIs are then fed into pretrained deep learning models—AlexNet, ResNet18, and GoogleNet—to extract features for identifying eight emotions, using varying batch sizes and optimizers. The core objective is to evaluate how effectively deep learning models classify emotions from the transformed time-frequency features, comparing their performance across transformation techniques and training conditions. The results show that AlexNet consistently outperforms the other models, achieving accuracy, precision, recall, and F1 scores of up to 98%, particularly when combined with the SGDM optimizer and CWT features. ResNet18 also performs strongly, with accuracy reaching 99% when paired with the Adam optimizer and CWT features, and GoogleNet likewise exhibits high accuracy with Adam. AlexNet is robust, maintaining high precision and recall across all configurations. Computational analysis reveals that AlexNet is time-efficient, particularly at large batch sizes, while GoogleNet incurs higher computational costs due to its more complex architecture. The study underscores the impact of optimizer selection, batch size, and feature extraction method on model performance and computational efficiency, offering valuable insights for optimizing deep learning models for RFS-based human emotion recognition.
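The pipeline described in the abstract (1D RF time series → 2D time-frequency image → pretrained CNN with an 8-way emotion head) can be sketched as follows. This is a minimal illustration, not the authors' code: the signal name, scale range, Morlet wavelet choice, and the replaced classification head are assumptions, and PyWavelets/torchvision stand in for whatever tooling the paper actually used.

```python
# Minimal sketch (assumed, not the authors' implementation) of the pipeline:
# a 1D RF time series is converted into a 2D scalogram with the continuous
# wavelet transform, then passed to a pretrained AlexNet whose final layer is
# swapped for an 8-class emotion head. Requires numpy, pywavelets, torch,
# torchvision (pretrained weights are downloaded on first use).
import numpy as np
import pywt
import torch
from torchvision import models, transforms

def signal_to_cwt_image(signal_1d: np.ndarray, n_scales: int = 64) -> torch.Tensor:
    """Convert a 1D signal into a 3x224x224 scalogram tensor."""
    scales = np.arange(1, n_scales + 1)                  # assumed scale range
    coeffs, _ = pywt.cwt(signal_1d, scales, "morl")      # (n_scales, n_samples)
    scalogram = np.abs(coeffs)
    scalogram = (scalogram - scalogram.min()) / (scalogram.max() - scalogram.min() + 1e-9)
    img = torch.tensor(scalogram, dtype=torch.float32).unsqueeze(0).repeat(3, 1, 1)
    return transforms.Resize((224, 224), antialias=True)(img)

# Pretrained AlexNet backbone; the 1000-class ImageNet head is replaced by an
# 8-class head, matching the eight emotions reported in the paper.
model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
model.classifier[-1] = torch.nn.Linear(4096, 8)
model.eval()

rf_trace = np.random.randn(2048)                         # placeholder RF time series
x = signal_to_cwt_image(rf_trace).unsqueeze(0)           # add batch dimension
with torch.no_grad():
    logits = model(x)                                     # shape: (1, 8)
```

Swapping `pywt.cwt` for `scipy.signal.stft` would give the STFT variant of the same pipeline, and the SGDM versus Adam comparison in the abstract corresponds to training such a head with `torch.optim.SGD(..., momentum=0.9)` versus `torch.optim.Adam`.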