Classification and mapping of sound sources in local urban streets through AudioSet data and Bayesian optimized Neural Networks

Impact factor: 1.7 (JCR Q2, Acoustics)
Deepank Verma, Arnab Jana, K. Ramamritham
Noise Mapping, published 2019-01-01. DOI: 10.1515/noise-2019-0005 (https://doi.org/10.1515/noise-2019-0005)
Citations: 12

Classification and mapping of sound sources in local urban streets through AudioSet data and Bayesian optimized Neural Networks
Abstract: Deep learning (DL) methods have provided several breakthroughs in conventional data analysis techniques, especially with image and audio datasets. Such models have made rapid assessment and large-scale quantification of environmental attributes possible. This study creates models based on Artificial Neural Networks (ANN) and Recurrent Neural Networks (RNN) to classify sound sources in sound clips collected manually in local streets. A subset of the openly available AudioSet data is used to train and evaluate the models on the sound classes common in urban streets. Audio data are collected at random locations within the selected study area of 0.2 sq. km. The audio clips are further classified according to the extent of anthropogenic (mainly traffic), natural, and human-based sounds present at particular locations. Rather than tuning model hyperparameters manually, the study uses Bayesian Optimization to obtain the hyperparameter values of the Neural Network models. The optimized models achieve an overall accuracy of 89 percent and 60 percent on the evaluation set for the three-class and fifteen-class models, respectively. The model detections are mapped across the study area using the Inverse Distance Weighted (IDW) spatial interpolation method.
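The abstract names Inverse Distance Weighted (IDW) interpolation as the mapping step but gives no details. The following is a minimal sketch of the general IDW technique, not the authors' implementation: the function name `idw_interpolate`, the power parameter of 2, the coordinates, and the per-clip values are all assumptions made for illustration. Each estimate is a weighted average of the sampled values, with weights proportional to 1 / distance^power.

```python
import math


def idw_interpolate(samples, query, power=2.0):
    """Estimate a value at `query` from (x, y, value) samples using IDW.

    `power` controls how quickly a sample's influence decays with distance;
    2.0 is a common default and is an assumption here, not a value from the paper.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        distance = math.hypot(query[0] - x, query[1] - y)
        if distance == 0.0:
            # The query coincides with a sample point, so return its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical recording locations (metres) with a made-up share of
# traffic sound detected in each clip; these are not data from the study.
samples = [(0.0, 0.0, 0.8), (100.0, 0.0, 0.2), (0.0, 100.0, 0.5)]
estimate = idw_interpolate(samples, (50.0, 0.0))
```

Evaluating the function over a regular grid of query points would give a raster that can be drawn as a map. The sketch uses every sample for each query; a real workflow might restrict the search to nearby points or rely on a GIS package instead.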
Source journal: Noise Mapping (Acoustics)
CiteScore: 7.80
Self-citation rate: 17.90%
Articles published: 5
Review time: 12 weeks
About the journal: Ever since its inception, Noise Mapping has been offering fast and comprehensive peer-review, while featuring prominent researchers among its Advisory Board. As a result, the journal is set to acquire a growing reputation as the main publication in the field of noise mapping, thus leading to a significant Impact Factor. The journal aims to promote and disseminate knowledge on noise mapping through the publication of high quality peer-reviewed papers focusing on the following aspects: noise mapping and noise action plans: case studies; models and algorithms for source characterization and outdoor sound propagation: proposals, applications, comparisons, round robin tests; local, national and international policies and good practices for noise mapping, planning, management and control; evaluation of noise mitigation actions; evaluation of environmental noise exposure; actions and communications to increase public awareness of environmental noise issues; outdoor soundscape studies and mapping; classification, evaluation and preservation of quiet areas.