Tao Zhang, Lingguo Kong, Xin Zhao, Donglei Li, Yanzhang Geng, Biyun Ding, Chao Wang
Applied Intelligence, volume 55, issue 6. Published 2025-03-01. DOI: 10.1007/s10489-025-06366-9. https://link.springer.com/article/10.1007/s10489-025-06366-9
A self-supervised anomalous machine sound detection model based on spectrogram decomposition and parallel sub-network
Anomalous Sound Detection (ASD) has research significance and application prospects in industrial automation. Most existing ASD models make limited use of machine sound features, which reduces their stability against sound anomalies and domain shift variations. To address these issues, we propose a self-supervised ASD model based on spectrogram decomposition and parallel sub-networks. Firstly, we decompose the spectrogram along the time and frequency dimensions to balance feature size and information integrity. This decomposition emphasizes the temporal and frequency variations in the feature map, facilitating a better understanding of the factors that affect machine sounds under domain shift conditions. Secondly, we design a pair of sub-networks that are trained in parallel. The parallel sub-networks employ self-attention mechanisms and shared gradients to capture changes in features across both the time and frequency dimensions, which improves model stability against anomalies and domain shifts. Finally, the anomaly scores of the sub-network branches are fused to give the anomaly detection result. The performance of the proposed model is validated on the DCASE2022 Task2 dataset. The Area under the Receiver Operating Characteristic Curve (AUC) and partial AUC (pAUC) of our model reached 72.89% and 64.83%, respectively. The results confirm the effectiveness of the proposed model.
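The abstract does not specify how the spectrogram is split or how the branch scores are fused, so the following is only a minimal sketch of the general idea under stated assumptions: the spectrogram is a list of frequency rows, it is cut into equal, non-overlapping chunks along each axis, and the two branch scores are combined by a weighted average. The function names (split_time, split_freq, fuse_scores, auc) and the weight parameter are hypothetical and not taken from the paper. The auc helper computes the standard AUC as the fraction of (anomalous, normal) pairs ranked correctly; pAUC is not computed here.

```python
def split_time(spec, n_parts):
    """Cut a spectrogram into n_parts chunks along the time axis.

    spec is a list of frequency rows, each row a list of time frames.
    Assumption: equal, non-overlapping chunks; leftover frames are dropped.
    """
    step = len(spec[0]) // n_parts
    return [[row[i * step:(i + 1) * step] for row in spec]
            for i in range(n_parts)]


def split_freq(spec, n_parts):
    """Cut a spectrogram into n_parts chunks along the frequency axis."""
    step = len(spec) // n_parts
    return [spec[i * step:(i + 1) * step] for i in range(n_parts)]


def fuse_scores(time_score, freq_score, weight=0.5):
    """Fuse the two branch anomaly scores (assumed: weighted average)."""
    return weight * time_score + (1.0 - weight) * freq_score


def auc(normal_scores, anomalous_scores):
    """AUC as the share of (anomalous, normal) pairs where the anomalous
    clip gets the higher score; ties count as half a correct pair."""
    correct = 0.0
    for a in anomalous_scores:
        for n in normal_scores:
            if a > n:
                correct += 1.0
            elif a == n:
                correct += 0.5
    return correct / (len(anomalous_scores) * len(normal_scores))


# Toy spectrogram: 4 frequency bins x 6 time frames.
spec = [[f * 10 + t for t in range(6)] for f in range(4)]
time_chunks = split_time(spec, 3)   # 3 chunks, each 4 bins x 2 frames
freq_chunks = split_freq(spec, 2)   # 2 chunks, each 2 bins x 6 frames
fused = fuse_scores(0.2, 0.6)       # equal-weight fusion of two scores
toy_auc = auc([0.1, 0.2, 0.3], [0.25, 0.4])
```

In this sketch each chunk would be fed to its own branch (time chunks to one sub-network, frequency chunks to the other), and only the final per-clip scores are fused; the actual branch architecture with self-attention and shared gradients is not reproduced here.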
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.