Abnormal Sound Event Detection Method Based on Time-Spectrum Information Fusion

Impact Factor: 0.8 · JCR Quartile: Q4 (Optics)
Changgeng Yu, Chaowen He, Dashi Lin
Published in *Optical Memory and Neural Networks*, Vol. 33, No. 4, pp. 411–421. DOI: 10.3103/S1060992X24700814. Publication date: February 3, 2025. Full text: https://link.springer.com/article/10.3103/S1060992X24700814
Citations: 0


Abstract
In this paper, we propose an abnormal sound event detection method based on a Time-Frequency Spectral Information Fusion Neural Network (TFSIFNN), addressing the problem that the temporal structure and frequency content of sound events in real environments vary widely, which degrades the performance of abnormal sound event detection. First, we construct a TCN-BiLSTM network based on Temporal Convolutional Networks (TCN) and Bidirectional Long Short-Term Memory (BiLSTM) networks to extract temporal context information from sound events. Next, we enhance the feature learning capability of the MobileNetV3 network through Efficient Channel Attention (ECA), culminating in the design of an ECA-MobileNetV3 network that captures the spectral information within sound events. Finally, a TFSIFNN model is established on top of TCN-BiLSTM and ECA-MobileNetV3 to improve the performance of abnormal sound event detection. Experimental results on the Urbansound8K and TUT Rare Sound Events 2017 datasets demonstrate that the TFSIFNN model achieves notable performance improvements. Specifically, it reaches an accuracy of 93.93% and an F1-Score of 94.15% on the Urbansound8K dataset. On the TUT Rare Sound Events 2017 dataset, compared to the baseline method, the error rate on the evaluation set decreases by 0.55, and the F1-Score improves by 29.69%.
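The ECA mechanism named in the abstract is compact enough to sketch. The following NumPy illustration is not the paper's implementation: the 1-D convolution weights are placeholders (in the actual ECA-MobileNetV3 they are learned parameters), and the feature-map dimensions are assumed log-mel spectrogram shapes. It shows the three ECA steps: per-channel global average pooling, a small 1-D convolution across neighbouring channels, and a sigmoid gate that re-weights each channel.

```python
import numpy as np

def eca_attention(feature_map, k_size=3):
    """Efficient Channel Attention (ECA), sketched in NumPy.
    feature_map: (C, H, W) array of spectral features.
    k_size: width of the 1-D cross-channel convolution (odd)."""
    c, h, w = feature_map.shape
    # Squeeze: per-channel global average pooling -> (C,)
    y = feature_map.mean(axis=(1, 2))
    # 1-D conv across neighbouring channels; edge padding keeps length C.
    pad = k_size // 2
    y_pad = np.pad(y, pad, mode="edge")
    kernel = np.ones(k_size) / k_size  # placeholder weights; learned in practice
    attn = np.array([np.dot(y_pad[i:i + k_size], kernel) for i in range(c)])
    # Excite: sigmoid gate in (0, 1), broadcast back over the H x W plane.
    gate = 1.0 / (1.0 + np.exp(-attn))
    return feature_map * gate[:, None, None]

# Example: re-weight a random 8-channel block standing in for log-mel features.
x = np.random.rand(8, 40, 100)
out = eca_attention(x)
print(out.shape)  # (8, 40, 100)
```

Because the gate lies strictly between 0 and 1, the output preserves the feature-map shape while scaling each channel by its learned importance; this is the step the paper uses to sharpen MobileNetV3's spectral features before fusion with the TCN-BiLSTM temporal branch.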

Source journal: Optical Memory and Neural Networks — CiteScore 1.50, self-citation rate 11.10%, 25 articles per year.
Journal description: The journal covers a wide range of issues in information optics such as optical memory, mechanisms for optical data recording and processing, photosensitive materials, optical, optoelectronic and holographic nanostructures, and many other related topics. Papers on memory systems using holographic and biological structures and concepts of brain operation are also included. The journal pays particular attention to research in the field of neural net systems that may lead to a new generation of computational technologies by endowing them with intelligence.