Implementation of Constant-Q Transform (CQT) and Mel Spectrogram to converting Bird’s Sound

Silvester Dian Handy Permana, Ketut Bayu Yogha Bintoro
DOI: 10.1109/COMNETSAT53002.2021.9530779 (https://doi.org/10.1109/COMNETSAT53002.2021.9530779)
Published in: 2021 IEEE International Conference on Communication, Networks and Satellite (COMNETSAT)
Publication date: 2021-07-17
Citations: 7

Abstract

Bird sounds can be classified in several ways. One option is a convolutional neural network (CNN), an algorithm commonly used for image classification. For a CNN to classify bird sounds, the analogue sound must first be converted into digital images objectively and accurately. This study discusses the conversion of analogue bird sounds into spectrogram images using the Constant-Q Transform (CQT) and the mel spectrogram. Bird vocalizations are recorded with a voice recorder, and the recording represents the audio signal digitally. The Constant-Q Transform maps the audio signal from the time domain to the frequency domain. The frequency axis is converted to a log scale, and the amplitude, shown as colour, is converted to decibels to form a spectrogram. The spectrogram is then mapped onto the mel scale to form a mel spectrogram. In short, this research converts analogue bird sounds into mel spectrograms intended for classification by a CNN. The resulting images can be classified with a CNN to help classify bird sounds.
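The abstract names three scale conversions (log-spaced CQT frequencies, amplitude to decibels, and frequency to mel) without giving formulas or parameters. The following is a minimal sketch of those conversions using only the Python standard library. All function names, the 12 bins per octave, the 55 Hz starting frequency, and the HTK-style mel formula are assumptions chosen for illustration; the paper's actual settings are not stated here.

```python
import math


def cqt_center_frequencies(f_min, n_bins, bins_per_octave=12):
    """Geometrically spaced CQT bin centres: f_k = f_min * 2**(k / B).

    Equal steps in k are equal steps on a log-frequency axis.
    """
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def cqt_quality_factor(bins_per_octave=12):
    """Ratio of centre frequency to bandwidth, identical for every bin.

    This constant ratio is what gives the Constant-Q Transform its name.
    """
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


def amplitude_to_db(amplitude, ref=1.0, floor=1e-10):
    """Convert a linear amplitude to decibels relative to `ref`.

    `floor` avoids log10(0) for silent bins (assumed value).
    """
    return 20.0 * math.log10(max(amplitude, floor) / ref)


def hz_to_mel(f_hz):
    """HTK-style mel scale; other conventions exist (assumption)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


if __name__ == "__main__":
    freqs = cqt_center_frequencies(f_min=55.0, n_bins=13)
    print("First and last bin (Hz):", freqs[0], freqs[-1])
    print("Q factor:", cqt_quality_factor())
    print("0.1 amplitude in dB:", amplitude_to_db(0.1))
    print("1000 Hz in mel:", hz_to_mel(1000.0))
```

In practice, an audio library such as librosa (`librosa.cqt`, `librosa.amplitude_to_db`, `librosa.feature.melspectrogram`) would likely handle the full transform on recorded audio; the sketch only shows the arithmetic behind the axis and colour scaling described above.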