HRSpecNET: A Deep Learning-Based High-Resolution Radar Micro-Doppler Signature Reconstruction for Improved HAR Classification
Sabyasachi Biswas; Ahmed Manavi Alam; Ali C. Gurbuz
IEEE Transactions on Radar Systems, vol. 2, pp. 484-497
DOI: 10.1109/TRS.2024.3396172
Published: 2024-03-02
Citations: 0
Abstract
Micro-Doppler signatures ($\mu$-DSs) are widely used for human activity recognition (HAR) using radar. However, traditional methods for generating $\mu$-DSs, such as the short-time Fourier transform (STFT), suffer from limitations such as the tradeoff between time and frequency resolution, noise sensitivity, and parameter calibration. To address these limitations, we propose a novel deep learning (DL)-based approach that reconstructs high-resolution $\mu$-DSs directly from a 1-D complex time-domain signal. Our DL architecture consists of an autoencoder (AE) block to improve the signal-to-noise ratio (SNR), an STFT block that learns frequency transformations to generate pseudo spectrograms, and finally a U-Net block to reconstruct high-resolution spectrogram images. We evaluated the proposed architecture on both synthetic and real-world data. For synthetic data, we generated 1-D complex time-domain signals with multiple time-varying frequency components and assessed the network's ability to generate high-resolution $\mu$-DSs and to perform at different SNR levels. For real-world data, a challenging radar-based American Sign Language (ASL) dataset of 100 words was used to evaluate the classification performance achieved with the $\mu$-DSs generated by the proposed approach. The results show that the proposed approach improves classification accuracy over traditional STFT-based $\mu$-DSs by 3.48%. Both the synthetic and experimental $\mu$-DSs show that the proposed approach learns to reconstruct higher-resolution, sparser spectrograms.
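The time-frequency resolution tradeoff that motivates the paper can be illustrated with a minimal sketch: a synthetic 1-D complex signal with two time-varying frequency components (in the spirit of the paper's synthetic test data, though all parameters here — sampling rate, chirp rates, window lengths — are illustrative assumptions, not values from the paper) analyzed with the STFT at two window lengths. A short window yields fine time resolution but few frequency bins; a long window yields the reverse.

```python
import numpy as np
from scipy.signal import stft

# Synthetic 1-D complex time-domain signal with two time-varying
# frequency components (an up-chirp and a down-chirp). Parameters
# are illustrative, not taken from the paper.
fs = 1000                      # sampling rate in Hz (assumed)
t = np.arange(0, 1.0, 1 / fs)  # 1 s of data
x = (np.exp(1j * 2 * np.pi * (100 * t + 50 * t**2))    # up-chirp
     + np.exp(1j * 2 * np.pi * (300 * t - 40 * t**2)))  # down-chirp

# STFT at a short vs. long window: the classic resolution tradeoff.
# return_onesided=False keeps the full (two-sided) spectrum, as
# needed for a complex-valued input.
for nperseg in (32, 256):
    f, tau, Z = stft(x, fs=fs, nperseg=nperseg, return_onesided=False)
    print(f"window={nperseg:3d}: {Z.shape[0]} freq bins x {Z.shape[1]} time frames")
```

The short window produces many time frames but only 32 frequency bins, while the long window produces 256 bins but far fewer frames; no single fixed window resolves both chirps sharply, which is the gap the learned reconstruction targets.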