A lightweight network for driver gesture recognition
Jingbin Hao, Jianhua Hu, Ci Liang, Xinhua Liu, Xiaokai Sun, Dezheng Hua
DOI: 10.1016/j.compeleceng.2025.110577
Computers & Electrical Engineering, Volume 127, Article 110577, published 2025-08-18
Citations: 0
Abstract
In smart-car technology, driver gesture recognition models often struggle to balance efficiency with accuracy, typically demanding substantial computational resources and memory. To address these challenges, this paper introduces a lightweight network structure, Intelligent Cockpit Gesture recognition-You Only Look Once version 7 (ICG-YOLOv7), based on YOLOv7. The study's contributions include an improved Convolutional Block Attention Module (CBAM) for stronger feature extraction and an unstructured pruning method for model compression. Specifically, a residual connection structure, built on the idea of residual learning, enables continuous feature recalibration, and a Four Conv BN SiLU Module (FCBSM) integrated into YOLOv7 prevents the loss of important original feature information. A custom cockpit-environment gesture dataset is also developed, on which comparative and ablation experiments are conducted. An unstructured pruning method based on layer-adaptive sparsification then compresses the improved network, enhancing its generalization ability while significantly reducing computation, parameter count, and model size. Experimental results demonstrate that the proposed approach effectively integrates neural networks and deep learning into vehicle development. Finally, a gradient-weighted class activation mapping (Grad-CAM) method is used to visualize and analyze the pruned model, improving the interpretability of the gesture recognition model and easing its deployment in vehicle systems for driver gesture recognition in cockpit environments.
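The abstract does not detail the layer-adaptive sparsification criterion, so the following is only a minimal NumPy sketch of one well-known layer-adaptive scheme (LAMP-style magnitude scoring), not the authors' implementation. Each weight is scored relative to the larger-magnitude weights in its own layer, so a single global threshold yields a different effective pruning ratio per layer; the function names and the 50% sparsity in the usage example are illustrative assumptions.

```python
import numpy as np

def lamp_scores(w):
    # Score of weight u: w_u^2 / sum of w_v^2 over weights in the same
    # layer with |w_v| >= |w_u| (per-layer normalised importance).
    flat = w.reshape(-1)
    order = np.argsort(-np.abs(flat))   # indices, descending magnitude
    sq = flat[order] ** 2
    denom = np.cumsum(sq)               # running sum includes the weight itself
    scores = np.empty_like(flat)
    scores[order] = sq / denom
    return scores.reshape(w.shape)

def prune_layer_adaptive(layers, sparsity):
    # Zero the globally lowest-scoring `sparsity` fraction of weights.
    # The per-layer score normalisation makes the per-layer ratio adaptive.
    scores = np.concatenate([lamp_scores(w).ravel() for w in layers])
    k = int(sparsity * scores.size)
    thresh = np.partition(scores, k)[k]  # k-th smallest score
    return [np.where(lamp_scores(w) < thresh, 0.0, w) for w in layers]
```

Because the largest-magnitude weight in every layer always receives the maximum score of 1.0, no layer is ever pruned away entirely, which is the practical motivation for layer-adaptive over plain global magnitude pruning.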
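The Grad-CAM visualization the abstract mentions follows a standard recipe: weight each channel of a chosen convolutional layer's activations by the spatial mean of the class-score gradient for that channel, sum, and apply ReLU. A minimal NumPy sketch of that computation, with the layer choice and input shapes as assumptions (a real model would supply `feature_maps` and `gradients` from a forward/backward pass):

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    # feature_maps: (K, H, W) activations A^k of the chosen conv layer.
    # gradients:    (K, H, W) d(class score)/dA^k from backprop.
    # Returns an (H, W) heatmap, ReLU'd and max-normalised.
    alpha = gradients.mean(axis=(1, 2))  # per-channel importance weights
    cam = np.maximum((alpha[:, None, None] * feature_maps).sum(axis=0), 0.0)
    if cam.max() > 0:
        cam /= cam.max()
    return cam
```

Regions where positively-weighted channels activate strongly light up, which is how such a map can show whether the pruned network still attends to the driver's hand rather than to background clutter.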
About the journal
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.