Sumair Aziz , Muhammad Umar Khan , Adil Usman , Muhammad Faraz , Yazeed Yasin Ghadi , Gabriel Axel Montes
{"title":"Bearing faults classification using novel log energy-based empirical mode decomposition and machine Mel-frequency cepstral coefficients","authors":"Sumair Aziz , Muhammad Umar Khan , Adil Usman , Muhammad Faraz , Yazeed Yasin Ghadi , Gabriel Axel Montes","doi":"10.1016/j.dsp.2024.104776","DOIUrl":"10.1016/j.dsp.2024.104776","url":null,"abstract":"<div><p>The accurate diagnosis of faults in bearing components is crucial for the safe and efficient operation of electrical and power drives. These machines generate sound and vibration signals that indicate their operational state. While vibration signals are often utilized for fault diagnosis, they require costly transducers. On the other hand, sound signal transducers are more affordable, but their lower signal-to-noise ratio complicates the differentiation between healthy and faulty bearings. This paper addresses these challenges by introducing a machine sound-based bearing fault diagnosis system. The proposed method employs a novel Log Energy-based Empirical Mode Decomposition and Reconstruction for advanced sound preprocessing. Feature extraction is performed using Machine Mel-frequency Cepstral Coefficients, with feature selection facilitated by a Genetic Algorithm. Classification is achieved through Support Vector Machines. The system demonstrated a high classification accuracy of 99.26% on the SUBF v2.0 dataset, outperforming other diagnostic methods, even in noisy conditions. 
This approach is particularly suited for industrial applications, offering a reliable solution for preventing downtime and ensuring the reliability of equipment.</p></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"156 ","pages":"Article 104776"},"PeriodicalIF":2.9,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142238998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bin Jiang , Hao Wu , Qingling Xia , Gen Li , Hanguang Xiao , Yun Zhao
{"title":"NKDFF-CNN: A convolutional neural network with narrow kernel and dual-view feature fusion for multitype gesture recognition based on sEMG","authors":"Bin Jiang , Hao Wu , Qingling Xia , Gen Li , Hanguang Xiao , Yun Zhao","doi":"10.1016/j.dsp.2024.104772","DOIUrl":"10.1016/j.dsp.2024.104772","url":null,"abstract":"<div><p>Deep learning algorithms have been widely applied to gesture recognition based on multi-channel surface electromyography (sEMG). However, the limitations in feature extraction capabilities of existing algorithms have restricted the performance of multitype gesture recognition. To address this challenge, we propose a novel sEMG-based gesture recognition algorithm, namely, Narrow Kernel and Dual-view Feature Fusion Convolutional Neural Network (NKDFF-CNN). Firstly, to overcome the issue of traditional square kernel convolution operation, which loses channel independence features, we employ the narrow kernel convolution in the model to learn time-related features in each independent channel of sEMG, thereby obtaining representative correlation information between specific muscles and gestures. Then, the dual-view structure is used to capture both shallow and deep features, which are fused at the decision level. Thus, the multi-dimensional feature information is extracted. The NKDFF-CNN is further extended to ACC-NKDFF-CNN by introducing acceleration signals for multimodal feature integration. Experimental validation on the NinaPro DB2 dataset demonstrates the superior classification performance of NKDFF-CNN, achieving 88.03 % accuracy for 49 hand gestures, outperforming the state-of-the-art MSFF-net. In addition, the ACC-NKDFF-CNN model with multimodal feature information significantly improved the accuracy to 95.25 %. We also validated the proposed NKDFF-CNN on NinaPro DB3 with disabled subjects and NinaPro DB4 with healthy subjects. 
The results showed that NKDFF-CNN achieved accuracies of 70.58 % and 85.91 %, respectively, for multitype hand gesture classification, demonstrating the strong generalization ability of the proposed model. Consequently, the proposed NKDFF-CNN method achieved superior recognition performance in both accuracy and generality compared to other advanced models. Thus, it provides a reliable algorithm for research in fields such as rehabilitative medicine.</p></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"156 ","pages":"Article 104772"},"PeriodicalIF":2.9,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142271046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Maximum correntropy polynomial chaos Kalman filter for underwater navigation","authors":"Rohit Kumar Singh, Joydeb Saha, Shovan Bhaumik","doi":"10.1016/j.dsp.2024.104774","DOIUrl":"10.1016/j.dsp.2024.104774","url":null,"abstract":"<div><p>This paper develops an underwater navigation solution that utilizes a strapdown inertial navigation system (SINS) and fuses a set of auxiliary sensors such as an acoustic positioning system, Doppler velocity log, depth meter, and magnetometer to accurately estimate an underwater vessel's position and orientation. The conventional integrated navigation system assumes Gaussian measurement noise, while in reality, the noises are non-Gaussian, particularly contaminated by heavy-tailed impulsive noises. To address this issue, and to fuse the system model with the acquired sensor measurements efficiently, we develop a square root polynomial chaos Kalman filter based on the maximum correntropy criterion. The proposed method uses Hermite polynomial chaos expansion to tackle the nonlinearity, and it has the potential to estimate the states more accurately in the presence of non-Gaussian measurement noise. The filter is initialized using acoustic beaconing to accurately locate the initial position of the vehicle. The computational complexity of the proposed filter is calculated in terms of flops count. The proposed method is compared with the existing maximum correntropy sigma point filters in terms of estimation accuracy and computational complexity. 
The simulation results show that the proposed method is more accurate than the conventional deterministic sample point filters and Huber's M-estimator.</p></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"155 ","pages":"Article 104774"},"PeriodicalIF":2.9,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142229805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tao Chen , Baochuan Qiu , Jinxin Li , Xiongrong Cai
{"title":"An end-to-end radar pulse deinterleaving structure based on point cloud mapping","authors":"Tao Chen , Baochuan Qiu , Jinxin Li , Xiongrong Cai","doi":"10.1016/j.dsp.2024.104773","DOIUrl":"10.1016/j.dsp.2024.104773","url":null,"abstract":"<div><p>Radar pulse deinterleaving is a critical technology of electronic reconnaissance equipment. This paper proposes an end-to-end radar pulse deinterleaving structure based on point cloud mapping. The core idea is to map radar pulse description words (PDWs) to a point cloud for mimetic vision, which converts the radar pulse deinterleaving task into a point cloud segmentation task. This structure is lightweight and generalizes strongly compared to the image segmentation-based deinterleaving structure. This paper then proposes a multi-stage graph convolution network (MSGCN) based on graph convolution for point cloud segmentation, which utilises the message passing mechanism of the graph structure to effectively extract, pass and fuse the features of different pulses, thus achieving better segmentation performance. The simulation results show that the proposed method can effectively deinterleave densely interleaved and overlapped pulses, and that it is highly robust in missing-pulse and spurious-pulse interference scenarios.</p></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"155 ","pages":"Article 104773"},"PeriodicalIF":2.9,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142238198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bowen Zhao , Hongdou He , Hang Xu , Peng Shi , Xiaobing Hao , Guoyan Huang
{"title":"RTIA-Mono: Real-time lightweight self-supervised monocular depth estimation with global-local information aggregation","authors":"Bowen Zhao , Hongdou He , Hang Xu , Peng Shi , Xiaobing Hao , Guoyan Huang","doi":"10.1016/j.dsp.2024.104769","DOIUrl":"10.1016/j.dsp.2024.104769","url":null,"abstract":"<div><p>Self-supervised monocular depth estimation has attracted significant attention in computer vision, especially for applications like autonomous driving and robotics. Recently, CNNs and Transformers have achieved tremendous success in this task. However, existing research primarily focuses on improving estimation accuracy, and the accompanying increase in model complexity poses challenges for deployment on edge computing devices. Shallow CNNs aid lightweight network construction but suffer from limited receptive fields, hindering the fusion of local geometric features and global semantic information. To address these issues, we propose an efficient real-time lightweight self-supervised architecture, RTIA-Mono, for monocular depth estimation. Firstly, we design a cross-stage feature fusion structure promoting feature aggregation and fusion across stages. Secondly, in each stage, we propose a Global Local Information Aggregation (GLIA) module integrating advantages of CNNs and Transformers to aggregate local and global features. Additionally, we introduce a Directional Feature Enhancement (DFE) module supplementing spatial structure information to mitigate spatial information loss from downsampling. Through sophisticated design, the proposed approach outperforms state-of-the-art methods on the KITTI benchmark with the fewest parameters and achieves a good balance among accuracy, complexity and inference speed. 
Furthermore, RTIA-Mono demonstrates excellent generalization on other datasets.</p></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"156 ","pages":"Article 104769"},"PeriodicalIF":2.9,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142271045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hermitian random walk graph Fourier transform for directed graphs and its applications","authors":"Deyun Wei, Shuangxiao Yuan","doi":"10.1016/j.dsp.2024.104751","DOIUrl":"10.1016/j.dsp.2024.104751","url":null,"abstract":"<div><p>Signal processing on directed graphs presents additional challenges since a complete set of eigenvectors is generally unavailable. To solve this problem, in this paper, a novel graph Fourier transform is constructed for representing and processing signals on directed graphs. Firstly, we introduce a Hermitian random walk Laplacian operator and show that it is Hermitian positive semi-definite. Hence, the obtained Laplacian operator is diagonalizable and yields orthogonal eigenvectors as graph Fourier basis. Secondly, we propose the Hermitian random walk graph Fourier transform (HRWGFT) with desirable properties, including unitarity and preservation of inner products. Furthermore, HRWGFT records the directionality of edges without sacrificing the information about the graph signal. Then, using these favorable properties, we derive spectral convolution to define the graph filter, which is the core tool for processing graph signals. Finally, based on the proposed Laplacian matrix and HRWGFT, we present several applications on synthetic and real-world networks, including signal denoising and data classification. 
The rationality and validity of our work are verified by simulations.</p></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"155 ","pages":"Article 104751"},"PeriodicalIF":2.9,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142238199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diffusion random Fourier adaptive filtering algorithm based on logistic distance metric for distributed estimation","authors":"Zhe Wu, Jingen Ni","doi":"10.1016/j.dsp.2024.104768","DOIUrl":"10.1016/j.dsp.2024.104768","url":null,"abstract":"<div><p>Distributed adaptive filtering over networks can improve filtering performance by fusing information from nodes within the same neighborhood. In nonlinear estimation, adaptive filters derived from a linear framework usually suffer from large misalignment. To solve this problem, this work develops a diffusion kernel filtering algorithm based on the random Fourier approximation method. To promote robustness to impulsive noise, the minimum logistic distance metric (LDM) is employed as a loss function. Compared to traditional kernel algorithms, the presented algorithm uses a fixed-length filter and is suitable for online distributed adaptive filtering tasks. In addition, this work also conducts a performance analysis based on Isserlis' and Price's theorems with several statistical assumptions. Simulations are conducted to exhibit the robustness of the proposed method to impulsive noise and to examine the accuracy of the theory on performance analysis.</p></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"156 ","pages":"Article 104768"},"PeriodicalIF":2.9,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142238996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FSM-YOLO: Apple leaf disease detection network based on adaptive feature capture and spatial context awareness","authors":"Chunman Yan, Kangyi Yang","doi":"10.1016/j.dsp.2024.104770","DOIUrl":"10.1016/j.dsp.2024.104770","url":null,"abstract":"<div><p>Apple leaf disease is a key factor affecting apple yield. Detecting apple leaf diseases in unstructured environments presents a significant challenge due to the diverse early forms and varying scales of the diseases, as well as the similarity between the diseased areas and the background. To address these challenges, this paper proposes an improved convolutional neural network FSM-YOLO with adaptive feature capture and spatial context awareness. Firstly, to address the lack of feature extraction due to the complex texture structure of disease features, AFEM (Adaptive Feature Enhancement Module) with the ability of contextual information fusion and channel information modulation is proposed, which enhances the feature extraction capability for multiple disease types. Secondly, the SCAA (Spatial Context-aware Attention) module with spatial relationship capture and adaptive receptive field adjustment is designed to enhance the network's ability to model spatial relationships and to focus on disease characteristics, so as to distinguish between disease targets and background information. Finally, MKMC (Multi-kernel mixed Convolution) is proposed to enhance multi-scale feature extraction capability by efficiently capturing and integrating information at multiple spatial resolutions to cope with different scales and shape variations of early leaf disease types. Experiments were conducted on an apple leaf disease dataset covering eight different disease types with 15,159 disease instances, and the experimental results show that compared with the baseline model YOLOv8s, FSM-YOLO improves mAP@0.5 by 2.7%, precision by 2.0%, and recall by 4.0%. 
Meanwhile, experimental results on the open-source apple leaf disease dataset ALDOD and plant leaf disease dataset PlantDoc show that FSM-YOLO outperforms the state-of-the-art algorithms, which validates the versatility of FSM-YOLO and confirms its excellent detection performance in various plant disease scenarios.</p></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"155 ","pages":"Article 104770"},"PeriodicalIF":2.9,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142238248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hehao Zhang, Zhengping Hu, Shuai Bi, Jirui Di, Zhe Sun
{"title":"Relation-aware interaction spatio-temporal network for 3D human pose estimation","authors":"Hehao Zhang, Zhengping Hu, Shuai Bi, Jirui Di, Zhe Sun","doi":"10.1016/j.dsp.2024.104764","DOIUrl":"10.1016/j.dsp.2024.104764","url":null,"abstract":"<div><p>3D human pose estimation is a fundamental task in analyzing human behavior, which has many practical applications. However, existing methods suffer from high time complexity and a weak ability to capture relations at the human joint level and the spatio-temporal level. To this end, the <strong>R</strong>elation-aware <strong>I</strong>nteraction <strong>S</strong>patio-temporal <strong>Net</strong>work (RISNet) is presented to achieve a better speed-accuracy trade-off in a parallel interactive architecture. Firstly, the Spatial Kinematics Modeling Block (SKMB) is proposed to encode spatially positional correlations among human joints, thereby capturing cross-joint kinematic dependencies in each frame. Secondly, the Temporal Trajectory Modeling Block (TTMB) is employed to further process the temporal motion trajectory of individual joints at several frame scales. Besides, the bi-directional interaction modules across branches are presented to enhance modeling abilities at the spatio-temporal level. Experiments on Human 3.6M, HumanEva-I and MPI-INF-3DHP benchmarks indicate that the RISNet achieves significant improvements over several state-of-the-art techniques. 
In conclusion, the proposed approach elegantly extracts critical features of body joints in the spatio-temporal domain with fewer model parameters and lower time complexity.</p></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"155 ","pages":"Article 104764"},"PeriodicalIF":2.9,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142162912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yi Li , Huiying Xu , Xinzhong Zhu , Xiao Huang , Hongbo Li
{"title":"THDet: A Lightweight and Efficient Traffic Helmet Object Detector based on YOLOv8","authors":"Yi Li , Huiying Xu , Xinzhong Zhu , Xiao Huang , Hongbo Li","doi":"10.1016/j.dsp.2024.104765","DOIUrl":"10.1016/j.dsp.2024.104765","url":null,"abstract":"<div><p>Traffic helmet object detection is playing an increasingly important role in smart traffic. However, object size variation and the detection of small helmets remain challenging because of their poor visual appearance in images. In this work, we present an efficient traffic helmet detector, called THDet, built on YOLOv8n through feature enhancement and lightweight design. Specifically, we embed coordinate attention into the C2f blocks, combined with a softmax activation function, to achieve feature channel aggregation and strong non-linear expression in the backbone for more effective feature extraction. Next, the Focal_CIoU loss function, which embeds the Focal Loss method, is used to measure bounding box regression more precisely for various objects and to balance positive and negative examples during training. Then, a new lightweight detection head is designed with only two heads at suitable positions (P3 & P4) to perform final classification and localization; this scheme saves 33.7% of the parameters relative to the baseline method. Finally, an Attention Refined Features Module (ARFM) is built to calibrate the multi-scale fused features by introducing 3-D weights generated from SimAttention to boost the final detection accuracy. Extensive experiments have demonstrated that our proposed method achieves notable performance in terms of detection accuracy and inference speed compared with the baseline YOLOv8n and many end-to-end detectors of similar model size. 
Concretely, THDet achieves 0.447 on the overall evaluation metric <span><math><mi>m</mi><mi>A</mi><msub><mrow><mi>P</mi></mrow><mrow><mn>0.5</mn><mo>−</mo><mn>0.95</mn></mrow></msub></math></span>, a 3.2% improvement in detection accuracy over YOLOv8n. In addition, THDet has only 2.2M parameters and an inference speed of 295 FPS, a 33.4% reduction in parameters compared with YOLOv8n. The experimental results validate the effectiveness of our proposed method, showcasing that THDet outperforms the mainstream real-time detection algorithms in terms of accuracy, inference speed and lightweight model design for traffic helmet object detection.</p></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"155 ","pages":"Article 104765"},"PeriodicalIF":2.9,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142173869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}