Wi-Fi-Based Human Fall and Activity Recognition Using Transformer-Based Encoder–Decoder and Graph Neural Networks
Younggeol Cho; Elisa Motta; Olivia Nocentini; Marta Lagomarsino; Andrea Merello; Marco Crepaldi; Arash Ajoudani
IEEE Sensors Journal, vol. 25, no. 18, pp. 34939-34947, published 2025-08-01. Impact factor 4.3; JCR Q1 (Engineering, Electrical & Electronic).
DOI: 10.1109/JSEN.2025.3593126
https://ieeexplore.ieee.org/document/11107310/
Human pose estimation and action recognition have received attention due to their critical roles in healthcare monitoring, rehabilitation, and assistive technologies. In this study, we propose a novel architecture, the Transformer-based encoder–decoder network (TED-Net), designed to estimate human skeleton poses from Wi-Fi channel state information (CSI). TED-Net integrates convolutional encoders with Transformer-based attention mechanisms to capture spatiotemporal features from CSI signals. The estimated skeleton poses are used as input to a customized directed graph neural network (DGNN) for action recognition. We validated our model on two datasets: a publicly available multimodal dataset for assessing general pose estimation, and a newly collected dataset focused on fall-related scenarios involving 20 participants. Experimental results demonstrate that TED-Net outperforms existing approaches in pose estimation and that the DGNN achieves reliable action classification using CSI-based skeletons, with performance comparable to RGB-based systems. Notably, TED-Net maintains robust performance across both fall and nonfall cases. These findings highlight the potential of CSI-driven human skeleton estimation for effective action recognition, particularly in home applications such as elderly fall detection. In such settings, Wi-Fi signals are often readily available, offering a privacy-preserving alternative to vision-based methods, which may raise concerns about continuous camera monitoring.
About the journal:
The fields of interest of the IEEE Sensors Journal are the theory, design, fabrication, manufacturing and applications of devices for sensing and transducing physical, chemical and biological phenomena, with emphasis on the electronics and physics aspects of sensors and integrated sensors-actuators. IEEE Sensors Journal deals with the following:
-Sensor Phenomenology, Modelling, and Evaluation
-Sensor Materials, Processing, and Fabrication
-Chemical and Gas Sensors
-Microfluidics and Biosensors
-Optical Sensors
-Physical Sensors: Temperature, Mechanical, Magnetic, and others
-Acoustic and Ultrasonic Sensors
-Sensor Packaging
-Sensor Networks
-Sensor Applications
-Sensor Systems: Signals, Processing, and Interfaces
-Actuators and Sensor Power Systems
-Sensor Signal Processing for high precision and stability (amplification, filtering, linearization, modulation/demodulation) and under harsh conditions (EMC, radiation, humidity, temperature); energy consumption/harvesting
-Sensor Data Processing (soft computing with sensor data, e.g., pattern recognition, machine learning, evolutionary computation; sensor data fusion; processing of wave (e.g., electromagnetic and acoustic) and non-wave (e.g., chemical, gravity, particle, thermal, radiative and non-radiative) sensor data; detection, estimation and classification based on sensor data)
-Sensors in Industrial Practice