Time-frequency Performance Study on Urban Sound Classification with Convolutional Neural Network

H. Shu, Ying Song, Huan Zhou
Published in: TENCON 2018 - 2018 IEEE Region 10 Conference, 2018-10-01
DOI: 10.1109/TENCON.2018.8650428 (https://doi.org/10.1109/TENCON.2018.8650428)
Citations: 5

Abstract

A convolutional neural network (ConvNet) is a class of deep feed-forward neural network that exploits the strong spatially local correlation in natural images, and it has performed well in visual analysis. Recently, ConvNets have also been applied to acoustic processing, where they have been shown to learn the spectro-temporal patterns of sounds and to differentiate them for classification. This manuscript studies how the time-frequency resolution of the input sound affects classification accuracy when a ConvNet is adopted. Simulation results show that the data augmentation scheme known as multi-width frequency-delta contributes little to performance when the network is carefully designed. In addition, a suitable temporal resolution in sound segmentation can achieve good classification performance.
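The time-frequency trade-off that the abstract refers to can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the abstract does not specify the feature extraction, so the Hann-windowed STFT, the function names (`stft_magnitude`, `frequency_delta`, `resolution`, `multi_width_features`), the 22050 Hz sample rate, the frame sizes, and the regression-style delta taken along the frequency axis are all assumptions chosen for illustration. In particular, stacking deltas of several widths as extra input channels is only one plausible reading of "multi-width frequency-delta".

```python
import numpy as np


def stft_magnitude(signal, frame_len, hop_len):
    """Magnitude spectrogram (Hann window), shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    frames = np.stack([
        signal[i * hop_len: i * hop_len + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))


def resolution(sample_rate, frame_len, hop_len):
    """Return (Hz per frequency bin, seconds per frame step)."""
    return sample_rate / frame_len, hop_len / sample_rate


def frequency_delta(spec, width):
    """Regression-style delta along the frequency axis over +/- `width` bins.

    Assumption: the same formula commonly used for temporal deltas, applied
    across frequency bins, with edge bins padded by repetition.
    """
    n_bins = spec.shape[1]
    padded = np.pad(spec, ((0, 0), (width, width)), mode="edge")
    num = sum(
        k * (padded[:, width + k: width + k + n_bins]
             - padded[:, width - k: width - k + n_bins])
        for k in range(1, width + 1)
    )
    denom = 2 * sum(k * k for k in range(1, width + 1))
    return num / denom


def multi_width_features(spec, widths):
    """Stack the spectrogram with one frequency-delta channel per width."""
    channels = [spec] + [frequency_delta(spec, w) for w in widths]
    return np.stack(channels, axis=-1)


# One second of a 1 kHz tone at an assumed 22050 Hz sample rate.
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000.0 * t)

# Long frames: finer frequency bins, coarser time steps.
fine_freq = stft_magnitude(tone, frame_len=1024, hop_len=512)
# Short frames: coarser frequency bins, finer time steps.
fine_time = stft_magnitude(tone, frame_len=256, hop_len=128)

features = multi_width_features(fine_freq, widths=(1, 2))
```

With these assumed settings, `resolution(sample_rate, 1024, 512)` and `resolution(sample_rate, 256, 128)` show the trade-off directly: quartering the frame length makes each frequency bin four times wider while making the time step four times shorter. A study of the kind described above would vary such settings and compare the resulting classification accuracy.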