Animal Sound Classification Using A Convolutional Neural Network

Emre Sasmaz, F. Tek
Published in: 2018 3rd International Conference on Computer Science and Engineering (UBMK), 2018-09-01
DOI: 10.1109/UBMK.2018.8566449 (https://doi.org/10.1109/UBMK.2018.8566449)
Citations: 21

Abstract

In this paper, we investigate the problem of animal sound classification using deep learning and propose a system based on a convolutional neural network architecture. As the input to the network, sound files were preprocessed with the LibROSA library to extract Mel Frequency Cepstral Coefficients (MFCC). To train and test the system, we collected 875 animal sound samples covering 10 different animal types from an online sound source site. We report classification confusion matrices and the results obtained with different gradient descent optimizers. The best accuracy, 75%, was obtained with Nesterov-accelerated Adaptive Moment Estimation (Nadam).
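The abstract names Nadam as the best-performing optimizer but does not give its update rule or the hyperparameters used. As a rough illustration only, the sketch below implements a commonly cited simplified form of the Nadam step (Adam's moment estimates with a Nesterov-style look-ahead on the first moment, without the momentum decay schedule that some library implementations add). The function names, the default hyperparameters, and the toy quadratic objective are assumptions made for this example and are not taken from the paper.

```python
import math


def nadam_step(theta, grad, m, v, t, lr=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """One simplified Nadam update for a scalar parameter (illustrative only)."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias corrections, as in Adam (t is the 1-based step count).
    bias1 = 1.0 - beta1 ** t
    m_hat = m / bias1
    v_hat = v / (1.0 - beta2 ** t)
    # Nesterov look-ahead: mix the corrected momentum with the current gradient.
    nesterov_m = beta1 * m_hat + (1.0 - beta1) * grad / bias1
    theta = theta - lr * nesterov_m / (math.sqrt(v_hat) + eps)
    return theta, m, v


def minimize_quadratic(target=3.0, steps=2000, lr=0.1):
    """Toy objective f(x) = (x - target)^2, standing in for a network loss."""
    x, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (x - target)  # derivative of (x - target)^2
        x, m, v = nadam_step(x, grad, m, v, t, lr=lr)
    return x


if __name__ == "__main__":
    print(minimize_quadratic())
```

In an actual training setup one would normally rely on the optimizer shipped with the deep learning framework rather than a hand-written step; the scalar version here is only meant to make the update rule concrete.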