A Real-time skeleton-based fall detection algorithm based on temporal convolutional networks and transformer encoder

Impact factor: 3.5 | Zone 3 (Computer Science) | JCR Q2 (Computer Science, Information Systems)
Xiaoqun Yu, Chenfeng Wang, Wenyu Wu, Shuping Xiong
Journal: Pervasive and Mobile Computing, Volume 107, Article 102016
Published: 2025-02-01
DOI: 10.1016/j.pmcj.2025.102016
URL: https://www.sciencedirect.com/science/article/pii/S1574119225000057
Publication type: Journal Article
Citations: 0

Abstract

As the population of older individuals living independently rises, coupled with the heightened risk of falls among this demographic, the need for automatic fall detection systems becomes increasingly urgent to ensure timely medical intervention. Computer vision (CV)-based methodologies have emerged as a preferred approach among researchers due to their contactless and pervasive nature. However, existing CV-based solutions often suffer from either poor robustness or prohibitively high computational requirements, impeding their practical implementation in elderly living environments. To address these challenges, we introduce TCNTE, a real-time skeleton-based fall detection algorithm that combines a Temporal Convolutional Network (TCN) with a Transformer Encoder (TE). We also mitigate the severe class imbalance issue by implementing a weighted focal loss. Cross-validation on multiple publicly available vision-based fall datasets demonstrates TCNTE's superiority over the individual models (TCN and TE) and existing state-of-the-art fall detection algorithms, achieving high accuracies (front view of UP-Fall: 99.58%; side view of UP-Fall: 98.75%; Le2i: 97.01%; GMDCSA-24: 92.99%) alongside practical viability. Visualizations using t-distributed stochastic neighbor embedding (t-SNE) reveal TCNTE's superior separation margin and cohesive clustering between fall and non-fall classes compared to TCN and TE. Crucially, TCNTE is designed for pervasive deployment in mobile and resource-constrained environments. Integrated with YOLOv8 pose estimation and BoT-SORT human tracking, the algorithm operates on an NVIDIA Jetson Orin NX edge device, achieving an average frame rate of 19 fps for single-person and 17 fps for two-person scenarios. With its validated accuracy and real-time performance, TCNTE holds significant promise for practical fall detection applications in older adult care settings.
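The abstract credits a weighted focal loss with mitigating the class imbalance between fall and non-fall samples, but gives no formula or hyperparameters. The following is a minimal sketch of the standard binary form, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), and is not the authors' implementation. The function names and the values `alpha=0.75` and `gamma=2.0` are illustrative assumptions.

```python
import math


def weighted_focal_loss(p_fall, label, alpha=0.75, gamma=2.0):
    """Binary weighted focal loss for a single sample.

    p_fall : predicted probability of the fall class
    label  : 1 for fall, 0 for non-fall
    alpha  : weight given to the (rare) fall class; assumed value
    gamma  : focusing parameter that down-weights easy samples; assumed value
    """
    eps = 1e-7
    # Clamp the probability so log() never receives 0.
    p_fall = min(max(p_fall, eps), 1.0 - eps)
    # p_t is the probability the model assigned to the true class.
    p_t = p_fall if label == 1 else 1.0 - p_fall
    # alpha_t is the class weight of the true class.
    alpha_t = alpha if label == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_weighted_focal_loss(probs, labels, alpha=0.75, gamma=2.0):
    """Average the per-sample loss over a batch of predictions."""
    losses = [weighted_focal_loss(p, y, alpha, gamma) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With these assumed settings, a missed fall (label 1, predicted 0.1) contributes far more to the loss than a confidently correct non-fall (label 0, predicted 0.1), which is how the loss is meant to keep the abundant easy non-fall samples from dominating training. Setting `gamma=0` reduces it to class-weighted cross-entropy.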
Source journal
Pervasive and Mobile Computing (Computer Science, Information Systems; Telecommunications)
CiteScore: 7.70
Self-citation rate: 2.30%
Publication volume: 80
Review time: 68 days
Journal description: As envisioned by Mark Weiser as early as 1991, pervasive computing systems and services have truly become integral parts of our daily lives. Tremendous developments in a multitude of technologies ranging from personalized and embedded smart devices (e.g., smartphones, sensors, wearables, IoTs, etc.) to ubiquitous connectivity, via a variety of wireless mobile communications and cognitive networking infrastructures, to advanced computing techniques (including edge, fog and cloud) and user-friendly middleware services and platforms have significantly contributed to the unprecedented advances in pervasive and mobile computing. Cutting-edge applications and paradigms have evolved, such as cyber-physical systems and smart environments (e.g., smart city, smart energy, smart transportation, smart healthcare, etc.) that also involve humans in the loop through social interactions and participatory and/or mobile crowd sensing, for example. The goal of pervasive computing systems is to improve human experience and quality of life, without explicit awareness of the underlying communications and computing technologies. The Pervasive and Mobile Computing Journal (PMC) is a high-impact, peer-reviewed technical journal that publishes high-quality scientific articles spanning theory and practice, and covering all aspects of pervasive and mobile computing and systems.