Jingbin Hao, Jianhua Hu, Ci Liang, Xinhua Liu, Xiaokai Sun, Dezheng Hua
DOI: 10.1016/j.compeleceng.2025.110577
Journal: Computers & Electrical Engineering, Volume 127, Article 110577
Published: 2025-08-18 (Journal Article)
Impact Factor: 4.9 · JCR Q1 (Computer Science, Hardware & Architecture)
Citations: 0
A lightweight network for driver gesture recognition
In the domain of smart car technology, driver gesture recognition models often face the challenge of balancing efficiency with accuracy, typically requiring substantial computational resources and memory. To address these challenges, this paper introduces a lightweight network structure, Intelligent Cockpit Gesture recognition-You Only Look Once version 7 (ICG-YOLOv7), based on YOLOv7. The contributions of this study include proposing an improved Convolutional Block Attention Module (CBAM) to enhance feature extraction and devising an unstructured pruning method to compress the model. Specifically, to achieve continuous feature recalibration, a residual connection structure is designed using the concept of residual learning. Moreover, to prevent the loss of feature information, a Four Conv BN SiLU Module (FCBSM) structure is designed and integrated into YOLOv7, retaining important original feature information. Furthermore, a custom cockpit-environment gesture dataset is developed, on which comparative and ablation experiments are conducted. An unstructured pruning method based on layer-adaptive sparsification is then designed to compress the improved network, enhancing the model's generalization ability while significantly reducing its computational cost, parameter count, and size. Experimental results demonstrate that the proposed approach effectively integrates neural networks and deep learning into vehicle development. Finally, a gradient-weighted class activation map (Grad-CAM) method is used to visualize and analyze the pruned model, enhancing the interpretability of the gesture recognition model and facilitating its deployment in vehicle systems for driver gesture recognition in cockpit environments.
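The abstract does not specify the exact pruning criterion, so the following is only an illustrative sketch of one well-known layer-adaptive unstructured pruning scheme (LAMP-style magnitude scoring) in NumPy. Each weight is scored by its squared magnitude normalized by the sum of squares of all equal-or-larger weights in the same layer; a single global threshold on these scores then yields a different sparsity per layer, which is the "layer adaptive sparsification" idea. The function names and the scoring rule here are assumptions for illustration, not the paper's method.

```python
import numpy as np

def lamp_scores(w):
    """Per-weight layer-adaptive scores: w_u^2 / sum of w_v^2 over |w_v| >= |w_u|."""
    flat_sq = w.ravel() ** 2
    order = np.argsort(flat_sq)                  # ascending by squared magnitude
    sorted_sq = flat_sq[order]
    # suffix[i] = sum of squares of all weights with magnitude >= sorted_sq[i]
    suffix = np.cumsum(sorted_sq[::-1])[::-1]
    scores = np.empty_like(flat_sq)
    scores[order] = sorted_sq / suffix
    return scores.reshape(w.shape)

def layer_adaptive_prune(layers, global_sparsity):
    """Return boolean keep-masks; one global score threshold -> per-layer sparsities."""
    all_scores = np.concatenate([lamp_scores(w).ravel() for w in layers])
    k = int(global_sparsity * all_scores.size)   # number of weights to prune
    threshold = np.sort(all_scores)[k]
    return [lamp_scores(w) >= threshold for w in layers]
```

Because the scores are normalized within each layer, layers whose weight magnitudes are more evenly distributed end up pruned more aggressively than layers dominated by a few large weights, without hand-tuning a sparsity ratio per layer.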
Journal introduction:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.