Audio Event Recognition in Noisy Environments using Power Spectral Density and Dimensionality Reduction
Md Siddat Bin Nesar, Bradley M. Whitaker
2022 Intermountain Engineering, Technology and Computing (IETC), 2022-05-01
DOI: 10.1109/ietc54973.2022.9796710
Citation count: 0
Abstract
Audio event detection has attracted considerable research interest because of its applications in surveillance, audio forensics, and other areas. One challenge is that events usually occur in noisy environments. In this paper, we propose a robust system that remains reliable whether it is trained under quiet or noisy conditions. A second challenge is the computational cost of collecting and analyzing long audio signals. In this work, we use power spectral density (PSD) and mel-frequency cepstral coefficients (MFCC) for feature extraction, and apply feature transformation and selection techniques to reduce the feature dimension significantly. Our system achieves an overall accuracy of 99.05% with the raw features, and 87.10% with a significantly reduced number of features.
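The PSD-then-reduce idea can be illustrated with a minimal sketch. This is not the authors' pipeline. The abstract does not name the PSD estimator or the selection technique, so the plain periodogram, the variance-based ranking, the synthetic tones, and all function names below (`periodogram_psd`, `top_variance_bins`, `noisy_tone`) are assumptions made for illustration. MFCC extraction is omitted.

```python
import cmath
import math
import random


def periodogram_psd(signal, fs):
    """One-sided periodogram PSD estimate (power per Hz) via a naive DFT."""
    n = len(signal)
    psd = []
    for k in range(n // 2 + 1):
        # k-th DFT coefficient, computed directly (O(n^2) overall; fine for short clips)
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        power = abs(coeff) ** 2 / (fs * n)
        # Double the interior bins to fold in the negative frequencies
        if 0 < k < n / 2:
            power *= 2
        psd.append(power)
    return psd


def top_variance_bins(feature_rows, keep):
    """Indices of the `keep` features with the largest variance across clips.

    Assumption: variance ranking stands in for the paper's unspecified
    feature selection technique.
    """
    n_rows = len(feature_rows)
    variances = []
    for j in range(len(feature_rows[0])):
        column = [row[j] for row in feature_rows]
        mean = sum(column) / n_rows
        variances.append(sum((v - mean) ** 2 for v in column) / n_rows)
    ranked = sorted(range(len(variances)), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:keep])


def noisy_tone(freq_hz, fs, n, noise_std, rng):
    """Synthetic 'event': a unit-amplitude sine plus Gaussian noise (hypothetical data)."""
    return [math.sin(2 * math.pi * freq_hz * t / fs) + rng.gauss(0.0, noise_std)
            for t in range(n)]


rng = random.Random(0)
FS, N = 800, 80  # 800 Hz sampling, 80 samples -> 10 Hz bin spacing

# Two synthetic event classes: four noisy clips at 100 Hz, four at 200 Hz
clips = [noisy_tone(f, FS, N, 0.3, rng) for f in (100, 200) for _ in range(4)]
psd_rows = [periodogram_psd(clip, FS) for clip in clips]

# Reduce each 41-bin PSD vector to the 2 most variable bins
selected = top_variance_bins(psd_rows, keep=2)
reduced = [[row[j] for j in selected] for row in psd_rows]
print(selected, len(psd_rows[0]), "->", len(reduced[0]))
```

In this toy setup the two retained bins are the ones at the tone frequencies, since only those bins differ strongly between the two classes. A real system would more likely use an averaged estimator such as Welch's method on recorded audio, and would compare the classifier's accuracy before and after the reduction.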