Title: STWWgram-ODCBAM: Multimodal feature fusion and dynamic attention mechanism for anomalous sound detection
Authors: Libin Zheng, Dongsheng Liu, Tong Wu, Yahui Chen
Journal: Signal Processing, vol. 239, Article 110218 (published 2025-08-05)
Impact factor: 3.6; JCR: Q2 (ENGINEERING, ELECTRICAL & ELECTRONIC)
DOI: 10.1016/j.sigpro.2025.110218
URL: https://www.sciencedirect.com/science/article/pii/S0165168425003329
Citations: 0
Abstract
Anomalous sound detection (ASD) aims to identify abnormal acoustic patterns emitted by machines or devices, enabling the timely detection of potential malfunctions. In recent years, various approaches have been proposed to extract both temporal and spectral features from audio data to improve detection performance. However, simply concatenating these features often leads to high-dimensional representations containing redundant information, which increases the risk of overfitting and hinders model performance. To address this issue, we propose a novel model based on a dynamic attention mechanism that adaptively selects and emphasizes informative temporal and spectral features while suppressing irrelevant noise. This enhances the quality of feature representation and improves the accuracy of anomaly detection. Moreover, we design a joint learning architecture that simultaneously captures multimodal features from both time and frequency domains, enabling the model to better capture the complex nature of audio signals and enrich the expressiveness of acoustic features. Experimental results demonstrate that the proposed method significantly outperforms state-of-the-art approaches on the DCASE 2020 Challenge Task 2 dataset, achieving AUC and mAUC improvements of 0.40% and 0.88%, respectively. Notably, for the challenging ToyConveyor machine type, our method achieves a remarkable 5.2% improvement in AUC, demonstrating strong robustness and generalization capability.
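The contrast the abstract draws between plain concatenation and attention-based feature selection can be illustrated with a small sketch. This is not the authors' ODCBAM module, whose details are not given here. It is a generic input-dependent gate over concatenated features, and every name in it (`gated_fusion`, `w1`, `w2`), the dimensions, and the random weights are assumptions for illustration only.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(vec, weights):
    """Multiply a length-n vector by an n x m matrix given as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, weights))
            for j in range(len(weights[0]))]


def gated_fusion(temporal, spectral, w1, w2):
    """Concatenate two feature vectors, then reweight each dimension.

    The gate is computed from the input itself, so the weighting changes
    per sample instead of being a fixed mask (hypothetical illustration).
    """
    fused = list(temporal) + list(spectral)             # plain concatenation
    hidden = [max(h, 0.0) for h in matvec(fused, w1)]   # ReLU bottleneck
    gate = [sigmoid(g) for g in matvec(hidden, w2)]     # each value in (0, 1)
    weighted = [f * g for f, g in zip(fused, gate)]     # damp or keep features
    return weighted, gate


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


# Toy inputs: 4 temporal and 6 spectral features, untrained random weights.
rng = random.Random(0)
temporal = [rng.gauss(0.0, 1.0) for _ in range(4)]
spectral = [rng.gauss(0.0, 1.0) for _ in range(6)]
w1 = random_matrix(10, 3, rng)
w2 = random_matrix(3, 10, rng)

weighted, gate = gated_fusion(temporal, spectral, w1, w2)
```

In a trained model the gate weights would be learned so that redundant dimensions are pushed toward zero. Here they are random, so the sketch only shows the data flow.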
About the journal:
Signal Processing incorporates all aspects of the theory and practice of signal processing. It features original research work, tutorial and review articles, and accounts of practical developments. It is intended for the rapid dissemination of knowledge and experience to engineers and scientists working in the research, development, or practical application of signal processing.
Subject areas covered by the journal include: Signal Theory; Stochastic Processes; Detection and Estimation; Spectral Analysis; Filtering; Signal Processing Systems; Software Developments; Image Processing; Pattern Recognition; Optical Signal Processing; Digital Signal Processing; Multi-dimensional Signal Processing; Communication Signal Processing; Biomedical Signal Processing; Geophysical and Astrophysical Signal Processing; Earth Resources Signal Processing; Acoustic and Vibration Signal Processing; Data Processing; Remote Sensing; Signal Processing Technology; Radar Signal Processing; Sonar Signal Processing; Industrial Applications; New Applications.