EgoFormer: Ego-Gesture Classification in Context of Autonomous Driving

IF 4.3 | CAS Zone 2 (Comprehensive Journals) | JCR Q1: ENGINEERING, ELECTRICAL & ELECTRONIC
Tayeba Qazi; M. Rupesh Kumar; Prerana Mukherjee; Brejesh Lall
{"title":"EgoFormer:自动驾驶背景下的自我姿态分类","authors":"Tayeba Qazi;M. Rupesh Kumar;Prerana Mukherjee;Brejesh Lall","doi":"10.1109/JSEN.2024.3390794","DOIUrl":null,"url":null,"abstract":"Decoding the intentions of passengers and other road users remains a critical challenge for autonomous vehicles (AVs) and intelligent transportation systems. Hand gestures are key in these interactions, offering a direct communication channel. Moreover, egocentric videos mimic a first-person perspective, aligning closely with human visual perception. Yet, the development of deep learning algorithms for detecting egocentric hand gestures in autonomous driving is hindered by the absence of useful datasets. Furthermore, there is a pressing need for gesture recognition methods to evolve from convolutional neural network (CNN)-based architectures to transformer models. To address these challenges, we present EgoDriving, a novel dataset of egocentric hand gestures, curated for driving-related hand gestures. Finally, we introduce EgoFormer, an efficient video transformer for egocentric hand gesture classification that is optimized for edge-computing deployments. EgoFormer incorporates a video dynamic position bias (VDPB) module to enhance long-range positional awareness and leverage absolute positions from convolutional sub-layers within its transformer blocks. Designed for low-resource settings, EgoFormer offers substantial reductions in inference latency and GPU utilization while maintaining competitive accuracy against the state-of-the-art hand gesture recognition frameworks.","PeriodicalId":447,"journal":{"name":"IEEE Sensors Journal","volume":"24 11","pages":"18133-18140"},"PeriodicalIF":4.3000,"publicationDate":"2024-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"EgoFormer: Ego-Gesture Classification in Context of Autonomous Driving\",\"authors\":\"Tayeba Qazi;M. Rupesh Kumar;Prerana Mukherjee;Brejesh Lall\",\"doi\":\"10.1109/JSEN.2024.3390794\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Decoding the intentions of passengers and other road users remains a critical challenge for autonomous vehicles (AVs) and intelligent transportation systems. Hand gestures are key in these interactions, offering a direct communication channel. Moreover, egocentric videos mimic a first-person perspective, aligning closely with human visual perception. Yet, the development of deep learning algorithms for detecting egocentric hand gestures in autonomous driving is hindered by the absence of useful datasets. Furthermore, there is a pressing need for gesture recognition methods to evolve from convolutional neural network (CNN)-based architectures to transformer models. To address these challenges, we present EgoDriving, a novel dataset of egocentric hand gestures, curated for driving-related hand gestures. Finally, we introduce EgoFormer, an efficient video transformer for egocentric hand gesture classification that is optimized for edge-computing deployments. EgoFormer incorporates a video dynamic position bias (VDPB) module to enhance long-range positional awareness and leverage absolute positions from convolutional sub-layers within its transformer blocks. 
Designed for low-resource settings, EgoFormer offers substantial reductions in inference latency and GPU utilization while maintaining competitive accuracy against the state-of-the-art hand gesture recognition frameworks.\",\"PeriodicalId\":447,\"journal\":{\"name\":\"IEEE Sensors Journal\",\"volume\":\"24 11\",\"pages\":\"18133-18140\"},\"PeriodicalIF\":4.3000,\"publicationDate\":\"2024-04-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Sensors Journal\",\"FirstCategoryId\":\"103\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10508297/\",\"RegionNum\":2,\"RegionCategory\":\"综合性期刊\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Sensors Journal","FirstCategoryId":"103","ListUrlMain":"https://ieeexplore.ieee.org/document/10508297/","RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Decoding the intentions of passengers and other road users remains a critical challenge for autonomous vehicles (AVs) and intelligent transportation systems. Hand gestures are key in these interactions, offering a direct communication channel. Moreover, egocentric videos mimic a first-person perspective, aligning closely with human visual perception. Yet, the development of deep learning algorithms for detecting egocentric hand gestures in autonomous driving is hindered by the absence of useful datasets. Furthermore, there is a pressing need for gesture recognition methods to evolve from convolutional neural network (CNN)-based architectures to transformer models. To address these challenges, we present EgoDriving, a novel dataset of egocentric hand gestures, curated for driving-related hand gestures. Finally, we introduce EgoFormer, an efficient video transformer for egocentric hand gesture classification that is optimized for edge-computing deployments. EgoFormer incorporates a video dynamic position bias (VDPB) module to enhance long-range positional awareness and leverage absolute positions from convolutional sub-layers within its transformer blocks. Designed for low-resource settings, EgoFormer offers substantial reductions in inference latency and GPU utilization while maintaining competitive accuracy against the state-of-the-art hand gesture recognition frameworks.
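
The abstract names two positional mechanisms: a video dynamic position bias (VDPB) that supplies long-range relative-position information to attention, and convolutional sub-layers that contribute absolute positions. As a hedged illustration only, the sketch below shows how such components are commonly built in the video transformer literature; VideoDynamicPositionBias, ConvPositionalSublayer, and every size in it are hypothetical stand-ins, not the authors' EgoFormer implementation.

```python
# Hypothetical sketch: approximates what a "video dynamic position bias" and a
# convolutional absolute-position sub-layer could look like, following common
# MLP-based dynamic-bias and depthwise-conv positional designs. All names,
# sizes, and layer choices are assumptions, not the paper's code.
import torch
import torch.nn as nn


class VideoDynamicPositionBias(nn.Module):
    """MLP that maps relative (t, h, w) offsets to per-head attention biases."""

    def __init__(self, num_heads: int, hidden_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, num_heads),
        )

    def forward(self, T: int, H: int, W: int) -> torch.Tensor:
        # Coordinates of every token in a T x H x W spatio-temporal grid.
        coords = torch.stack(
            torch.meshgrid(
                torch.arange(T), torch.arange(H), torch.arange(W), indexing="ij"
            ),
            dim=-1,
        ).reshape(-1, 3).float()                        # (N, 3), N = T*H*W
        rel = coords[:, None, :] - coords[None, :, :]   # (N, N, 3) offsets
        return self.mlp(rel).permute(2, 0, 1)           # (num_heads, N, N)


class ConvPositionalSublayer(nn.Module):
    """Depthwise 3-D conv over the token grid: one common way to inject
    absolute-position cues into a transformer block (assumed design)."""

    def __init__(self, dim: int):
        super().__init__()
        self.dwconv = nn.Conv3d(dim, dim, kernel_size=3, padding=1, groups=dim)

    def forward(self, x: torch.Tensor, T: int, H: int, W: int) -> torch.Tensor:
        B, N, C = x.shape                               # tokens as (B, N, C)
        grid = x.transpose(1, 2).reshape(B, C, T, H, W)
        return x + self.dwconv(grid).reshape(B, C, N).transpose(1, 2)


if __name__ == "__main__":
    heads, dim, T, H, W = 4, 32, 2, 7, 7
    N = T * H * W
    bias = VideoDynamicPositionBias(num_heads=heads)(T, H, W)
    q = torch.randn(1, heads, N, dim)
    k = torch.randn(1, heads, N, dim)
    # The bias is added to the query-key logits before the softmax.
    attn = ((q @ k.transpose(-2, -1)) / dim**0.5 + bias).softmax(dim=-1)
    print(attn.shape)                                   # (1, 4, 98, 98)

    x = torch.randn(1, N, dim)
    print(ConvPositionalSublayer(dim)(x, T, H, W).shape)  # (1, 98, 32)
```

Because the bias comes from an MLP over continuous offsets rather than a fixed lookup table, it can be evaluated at clip lengths and spatial resolutions not seen during training; that generalization is the usual argument for a dynamic rather than static relative position bias, and plausibly what the abstract's "long-range positional awareness" refers to.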
Source Journal: IEEE Sensors Journal (Engineering & Technology; Engineering: Electrical & Electronic)
CiteScore: 7.70
Self-citation rate: 14.00%
Articles per year: 2058
Review time: 5.2 months
Aims and Scope: The fields of interest of the IEEE Sensors Journal are the theory, design, fabrication, manufacturing, and applications of devices for sensing and transducing physical, chemical, and biological phenomena, with emphasis on the electronics and physics aspects of sensors and integrated sensor-actuators. IEEE Sensors Journal deals with the following:
- Sensor Phenomenology, Modelling, and Evaluation
- Sensor Materials, Processing, and Fabrication
- Chemical and Gas Sensors
- Microfluidics and Biosensors
- Optical Sensors
- Physical Sensors: Temperature, Mechanical, Magnetic, and Others
- Acoustic and Ultrasonic Sensors
- Sensor Packaging
- Sensor Networks
- Sensor Applications
- Sensor Systems: Signals, Processing, and Interfaces
- Actuators and Sensor Power Systems
- Sensor Signal Processing for high precision and stability (amplification, filtering, linearization, modulation/demodulation) and under harsh conditions (EMC, radiation, humidity, temperature); energy consumption/harvesting
- Sensor Data Processing (soft computing with sensor data, e.g., pattern recognition, machine learning, evolutionary computation; sensor data fusion; processing of wave (e.g., electromagnetic and acoustic) and non-wave (e.g., chemical, gravity, particle, thermal, radiative and non-radiative) sensor data; detection, estimation, and classification based on sensor data)
- Sensors in Industrial Practice