Classification and mapping of sound sources in local urban streets through AudioSet data and Bayesian optimized Neural Networks

Impact Factor: 1.7 (JCR Q2, Acoustics)

Noise Mapping. Pub Date: 2019-01-01. DOI: 10.1515/noise-2019-0005

Deepank Verma, Arnab Jana, K. Ramamritham

Noise Mapping, vol. 6, no. 1, pp. 52-71 (2019). Journal Article.

Citations: 12

Abstract

Deep learning (DL) methods have delivered several breakthroughs over conventional data analysis techniques, especially with image and audio datasets. Such models have made rapid assessment and large-scale quantification of environmental attributes possible. This study develops models based on Artificial Neural Networks (ANN) and Recurrent Neural Networks (RNN) to classify sound sources in manually collected sound clips from local streets. A subset of the openly available AudioSet data is used to train and evaluate the models on the sound classes common in urban streets. Audio data are collected at random locations within a selected study area of 0.2 sq. km. The audio clips are further classified according to the extent of anthropogenic (mainly traffic), natural, and human-based sounds present at each location. Rather than tuning model hyperparameters manually, the study uses Bayesian Optimization to obtain the hyperparameter values of the Neural Network models. The optimized models achieve an overall accuracy of 89 percent and 60 percent on the evaluation set for the three-class and fifteen-class models, respectively. The model detections are mapped across the study area using the Inverse Distance Weighted (IDW) spatial interpolation method.
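
To make the mapping step concrete, the following is a minimal sketch of Inverse Distance Weighted interpolation, in which the estimate at an unsampled point is the average of the sampled values weighted by 1/d^p, with d the distance to each sample. It is not the authors' implementation: the function name `idw_interpolate`, the power p = 2, and the sample coordinates and values are all assumptions chosen for illustration.

```python
import math


def idw_interpolate(samples, query, power=2.0):
    """Estimate a value at `query` (x, y) from (x, y, value) samples using IDW.

    `power` controls how quickly a sample's influence decays with distance;
    2.0 is a common default, assumed here rather than taken from the paper.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        distance = math.hypot(query[0] - x, query[1] - y)
        if distance == 0.0:
            # The query coincides with a sample point: return its value exactly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total


# Hypothetical recording locations: (x in metres, y in metres,
# share of the clip classified as traffic sound by some model).
samples = [(0.0, 0.0, 0.9), (100.0, 0.0, 0.5), (0.0, 100.0, 0.1)]

# The query point is equidistant from all three samples, so the
# estimate reduces to their plain average.
estimate = idw_interpolate(samples, (50.0, 50.0))
```

Evaluating such a function over a regular grid of query points would yield a raster surface for each sound class. A larger `power` makes the surface follow the nearest recording more closely, while a smaller one smooths it.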


Source journal: Noise Mapping (Acoustics)

CiteScore: 7.80

Self-citation rate: 17.90%

Review time: 12 weeks

Journal description: Ever since its inception, Noise Mapping has offered fast and comprehensive peer review and has featured prominent researchers on its Advisory Board. As a result, the journal is set to acquire a growing reputation as the main publication in the field of noise mapping, thus leading to a significant Impact Factor. The journal aims to promote and disseminate knowledge on noise mapping through the publication of high-quality peer-reviewed papers focusing on the following aspects: noise mapping and noise action plans: case studies; models and algorithms for source characterization and outdoor sound propagation: proposals, applications, comparisons, round robin tests; local, national and international policies and good practices for noise mapping, planning, management and control; evaluation of noise mitigation actions; evaluation of environmental noise exposure; actions and communications to increase public awareness of environmental noise issues; outdoor soundscape studies and mapping; classification, evaluation and preservation of quiet areas.