A Deep Transfer Learning Model for the Identification of Bird Songs: A Case Study for Mauritius

Evans Jason Henri, Zahra Mungloo-Dilmohamud
Published in: 2021 International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME), 2021-10-07
DOI: 10.1109/ICECCME52200.2021.9590917
Citations: 1

Abstract

Birds communicate with their colonies through sound and inform them of potential problems like forest fires. The identification of bird sounds is therefore very important and has the potential to solve some global problems. Convolutional neural networks (CNNs) are deep learning algorithms that have proven effective in image processing and in sound classification. This paper describes the development of a tool that uses a deep learning model to classify Mauritius bird sounds from audio recordings. A dataset obtained from the Xeno-canto bird song sharing site, which hosts a vast collection of labeled and classified recordings, is used to fine-tune three pre-trained CNN models, namely InceptionV3, MobileNetV2 and ResNet50, and to train a custom model. The input to the neural network consists of spectrograms created from the downloaded mp3 files. Time shifting and pitch stretching have been used for data augmentation. The best performing model has been integrated into a website that identifies birds from sound recordings. In this work, transfer learning has been used successfully to produce a model with a weighted accuracy of 84%. Although a custom CNN was also trained, better accuracy was achieved through the use of transfer learning.
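
The abstract does not state which tools or parameters were used to build the spectrograms or to apply time shifting, so the following is only a minimal sketch under stated assumptions. It uses NumPy, a synthetic 1 kHz tone in place of a decoded mp3 recording, a circular shift as one possible reading of "time shifting", and a plain Hann-windowed short-time Fourier magnitude as the spectrogram. The function names, frame length, hop size and sample rate are hypothetical and not taken from the paper.

```python
import numpy as np


def time_shift(signal: np.ndarray, shift: int) -> np.ndarray:
    """Circularly shift a 1-D waveform by `shift` samples (assumed augmentation)."""
    return np.roll(signal, shift)


def magnitude_spectrogram(signal: np.ndarray, frame_len: int = 256, hop: int = 128) -> np.ndarray:
    """Hann-windowed short-time Fourier magnitude, shape (frame_len // 2 + 1, n_frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # rfft keeps only the non-negative frequency bins of each real-valued frame.
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Synthetic stand-in for a decoded recording: one second of a 1 kHz tone at 8 kHz.
sample_rate = 8000
frame_len = 256
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

# Augment first, then convert the waveform into the 2-D array a CNN would consume.
augmented = time_shift(tone, shift=403)
spec = magnitude_spectrogram(augmented, frame_len=frame_len, hop=128)

# Frequency of the strongest bin, averaged over all frames.
peak_hz = spec.mean(axis=1).argmax() * sample_rate / frame_len
```

In a pipeline like the one described, each spectrogram would typically be rescaled and resized to the input shape expected by the pre-trained network before fine-tuning; those steps are omitted here because the abstract gives no details about them.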