Sparsity-Driven Gesture Recognition Using Lightweight TCNFormer Networks in Millimeter-Wave Radar

Hao Wu; Biao Jin; Zhenkai Zhang; Zhuxian Lian; Baoxiong Xu; Jin Liang; Xiangqun Zhang; Genyuan Du

IEEE Transactions on Instrumentation and Measurement, vol. 74, pp. 1-12. Published 2025-06-02. DOI: 10.1109/TIM.2025.3576011
Gesture recognition with millimeter-wave radar has broad application prospects in human-computer interaction. However, the traditional recognition methods generate overly redundant features and construct large-scale networks, rendering them unsuitable for embedded devices with limited memory. To address this challenge, we propose a sparse-driven dynamic gesture recognition network in millimeter-wave radar, named time convolutional network and transFormer (TCNFormer). First, we employ a 2-D fast Fourier transform (2D-FFT) to obtain the range-Doppler maps (RDMs). These maps are then processed through incoherent integration of multiple frames to produce Doppler-time maps (DTMs). We subsequently use the orthogonal matching pursuit (OMP) algorithm to achieve a sparse representation of the Doppler-time trajectories and integrate the RDM to extract the range features of gestures, obtaining the multidimensional sparse sequences encompassing the range-Doppler-time feature. We then design a TCNFormer network tailored to these multidimensional sparse sequences. This network leverages a shallow TCN to learn local features, a Transformer network to capture global features and an adaptive weighting method to fuse these local and global features effectively. Experimental results demonstrate that our network fully exploits the sparse multidimensional sequences, achieving a recognition accuracy of 99.17% on a self-built dataset. The parameter size of the network is only 0.13 M, significantly outperforming the existing state-of-the-art models in relevant metrics, thereby proving its suitability for embedded applications in human-computer interaction.
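The preprocessing chain described in the abstract (2D-FFT per frame to form range-Doppler maps, then incoherent integration across frames to form Doppler-time maps) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the cube dimensions, the use of a sum over the range axis as the incoherent integration, and the simulated input are all assumptions.

```python
import numpy as np

# Hypothetical sketch of the preprocessing chain: a 2D-FFT over fast time
# (range) and slow time (Doppler) gives a range-Doppler map (RDM) per frame;
# summing each RDM's power over range collapses it into a Doppler profile,
# and stacking profiles across frames yields a Doppler-time map (DTM).
# Frame counts and sizes below are illustrative only.
N_FAST, N_SLOW, N_FRAMES = 64, 32, 16

rng = np.random.default_rng(0)
# Simulated raw radar cube: frames x slow-time chirps x fast-time samples.
cube = rng.standard_normal((N_FRAMES, N_SLOW, N_FAST)) \
     + 1j * rng.standard_normal((N_FRAMES, N_SLOW, N_FAST))

# 2D-FFT: range FFT along fast time, Doppler FFT along slow time,
# with the Doppler axis shifted so zero velocity sits in the center.
rdms = np.fft.fftshift(np.fft.fft2(cube, axes=(1, 2)), axes=1)
power = np.abs(rdms) ** 2            # (frames, Doppler bins, range bins)

# Incoherent integration: sum power over range per frame, then stack frames
# column-wise so rows are Doppler bins and columns are time (frames).
dtm = power.sum(axis=2).T            # shape: (Doppler bins, frames)
print(dtm.shape)                     # (32, 16)
```

In a real pipeline the range features the paper mentions would be read off the per-frame RDMs before the range axis is collapsed.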
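The abstract's sparse representation of the Doppler-time trajectories relies on orthogonal matching pursuit (OMP). A minimal generic OMP, shown here on a synthetic dictionary rather than the paper's radar data, greedily selects the dictionary atom most correlated with the residual and re-solves a least-squares problem over the selected support at each step; the dictionary shape and sparsity level are assumptions for illustration.

```python
import numpy as np

def omp(A, y, k):
    """Greedy orthogonal matching pursuit: select up to k atoms of the
    dictionary A by largest residual correlation, re-solving the least-squares
    fit over the current support after every selection."""
    residual = y.copy()
    support = []
    x = np.zeros(A.shape[1])
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x[support] = coef
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((40, 100))
A /= np.linalg.norm(A, axis=0)              # unit-norm dictionary atoms
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [2.0, -1.5, 1.0]      # 3-sparse ground truth
y = A @ x_true
x_hat = omp(A, y, k=3)
print(sorted(np.nonzero(x_hat)[0]))         # recovered support
```

For a 3-sparse signal against an incoherent random dictionary, OMP typically recovers the exact support, so the reconstruction residual is near zero.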
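The abstract states that local (shallow-TCN) and global (Transformer) features are fused by an adaptive weighting method but does not give the rule. One common realization, used here purely as an assumption, is a learned sigmoid gate that forms a per-feature convex combination of the two branches; the gate parameters `w` and `b` and all shapes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adaptive_fuse(local_feat, global_feat, w, b):
    """Compute a per-feature gate alpha from the concatenated branch outputs,
    then return the convex combination alpha*local + (1-alpha)*global."""
    gate_in = np.concatenate([local_feat, global_feat], axis=-1)
    alpha = sigmoid(gate_in @ w + b)        # (batch, d), each entry in (0, 1)
    return alpha * local_feat + (1.0 - alpha) * global_feat

rng = np.random.default_rng(2)
d = 8
local_feat = rng.standard_normal((4, d))    # stand-in for shallow-TCN output
global_feat = rng.standard_normal((4, d))   # stand-in for Transformer output
w = rng.standard_normal((2 * d, d)) * 0.1   # hypothetical gate weights
b = np.zeros(d)
fused = adaptive_fuse(local_feat, global_feat, w, b)
print(fused.shape)                          # (4, 8)
```

Because the gate output lies strictly in (0, 1), every fused value stays between the corresponding local and global feature values, which keeps the fusion stable regardless of the gate parameters.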
Journal Introduction:
Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support to establishment and maintenance of technical standards in the field of Instrumentation and Measurement.