Robust Classification of Urban Sounds in Noisy Environments: A Novel Approach Using SPWVD-MFCC and Dual-Stream Classifier
Authors: Bo Peng, Kevin I-Kai Wang, Waleed H. Abdulla
Journal: Acoustics Australia, 53 (2), pp. 253-268 (Journal Article, published 2025-03-24)
DOI: 10.1007/s40857-025-00350-6
Article page: https://link.springer.com/article/10.1007/s40857-025-00350-6
PDF: https://link.springer.com/content/pdf/10.1007/s40857-025-00350-6.pdf
Urban sound classification is essential for effective sound monitoring and mitigation strategies, which are critical to addressing the negative impacts of noise pollution on public health. While existing methods predominantly rely on Short-Time Fourier Transform (STFT)-based features such as Mel-Frequency Cepstral Coefficients (MFCC), these approaches often struggle to identify the dominant sound in noisy environments. This gap in robustness limits the practical deployment of such systems in real-world urban settings, where noise levels are unpredictable and variable. Here, we introduce Smoothed Pseudo-Wigner–Ville Distribution-based MFCC (SPWVD-MFCC), a novel feature that merges SPWVD’s high time–frequency resolution with MFCC’s human-like auditory sensitivity. We further propose a dual-stream ResNet50-CNN-LSTM architecture to classify these features. Comprehensive experiments conducted on the UrbanSound8K, UrbanSoundPlus, and DCASE2016 datasets demonstrate that the proposed SPWVD-MFCC significantly improves classification accuracy in noisy conditions, with an enhancement of up to 37.2% over traditional STFT-based methods and better robustness than existing approaches. These results indicate that the proposed approach addresses a critical gap in urban sound classification by providing enhanced robustness in low-SNR environments. This advancement improves the reliability of urban noise monitoring systems and contributes to the broader goal of creating healthier urban living environments by enabling more effective noise-control strategies.
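To make the feature idea in the abstract concrete, the following is a minimal sketch of how an SPWVD-based MFCC could be computed: a discrete smoothed pseudo-Wigner-Ville distribution replaces the STFT power spectrum, and the usual mel filterbank, logarithm, and DCT follow. It is not the authors' implementation. The function names, the Hamming windows, the window lengths, the hop size, the clipping of negative distribution values to zero, and the unnormalized DCT-II are all assumptions chosen for illustration.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of real x by zeroing negative frequencies."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    gain[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        gain[n // 2] = 1.0  # keep the Nyquist bin once for even lengths
    return np.fft.ifft(np.fft.fft(x) * gain)


def spwvd(x, fs, n_fft=256, lag_half=63, smooth_half=8, hop=32):
    """Discrete SPWVD (hypothetical helper; all window choices are assumptions).

    Returns (tfd, freqs): tfd has shape (n_frames, n_fft) and freqs spans
    0 .. fs/2, because the lag step of two samples halves the bin spacing.
    """
    z = analytic_signal(np.asarray(x, dtype=float))
    pad = lag_half + smooth_half
    zp = np.pad(z, pad)                            # zero-pad both ends
    m = np.arange(-lag_half, lag_half + 1)         # lag index
    p = np.arange(-smooth_half, smooth_half + 1)   # time-smoothing index
    h = np.hamming(m.size)                         # lag (frequency-smoothing) window
    g = np.hamming(p.size)
    g /= g.sum()                                   # time-smoothing window, unit sum
    centers = np.arange(0, len(z), hop) + pad
    tfd = np.empty((centers.size, n_fft))
    for i, c in enumerate(centers):
        plus = zp[c + p[:, None] + m[None, :]]
        minus = zp[c + p[:, None] - m[None, :]]
        # Time-smoothed instantaneous autocorrelation for every lag m.
        kernel = (g[:, None] * plus * np.conj(minus)).sum(axis=0)
        buf = np.zeros(n_fft, dtype=complex)
        buf[m % n_fft] = h * kernel                # negative lags wrap around
        tfd[i] = np.fft.fft(buf).real              # Hermitian in m, so FFT is real
    freqs = np.arange(n_fft) * fs / (2.0 * n_fft)
    return tfd, freqs


def mel_filterbank(freqs, n_mels):
    """Triangular mel filters evaluated on an arbitrary frequency grid."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda mel: 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(freqs[-1]), n_mels + 2))
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def spwvd_mfcc(x, fs, n_mels=40, n_mfcc=13, **spwvd_kwargs):
    """SPWVD -> mel energies -> log -> DCT-II; returns (n_frames, n_mfcc)."""
    tfd, freqs = spwvd(x, fs, **spwvd_kwargs)
    power = np.maximum(tfd, 0.0)  # assumption: discard negative cross-term values
    log_mel = np.log(power @ mel_filterbank(freqs, n_mels).T + 1e-10)
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2.0 * n_mels))  # unnormalized DCT-II
    return log_mel @ dct.T
```

Calling `spwvd_mfcc(signal, fs)` on a one-dimensional real signal yields one coefficient vector per frame, in the same layout as a conventional MFCC matrix. The per-frame loop favors readability over speed; a practical implementation would likely vectorize it or use a dedicated time-frequency library.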
About the journal:
Acoustics Australia, the journal of the Australian Acoustical Society, has published high-quality research and technical papers in all areas of acoustics since its commencement in 1972. The target audience for the journal includes both researchers and practitioners. It aims to publish papers and technical notes that are relevant to current acoustics and of interest to members of the Society. These include, but are not limited to: Architectural and Building Acoustics, Environmental Noise, Underwater Acoustics, Engineering Noise and Vibration Control, Occupational Noise Management, Hearing, and Musical Acoustics.