Bird species classification using visual and acoustic features extracted from audio signal

D. Lucio, Yandre M. G. Costa
Published in: 2016 35th International Conference of the Chilean Computer Science Society (SCCC), 2016-10-01
DOI: https://doi.org/10.1109/SCCC.2016.7836063
Citations: 1

Abstract

Bird species classification using visual and acoustic features extracted from audio signal
This work presents a system for automatic bird species classification based on acoustic and visual features extracted from birdsong. The visual features are extracted from spectrogram images generated from the birdsong audio, while the acoustic features are taken directly from the audio. Texture descriptors are used to describe the spectrogram content, since texture is the main visual content found in this kind of image. The texture operators used are Local Binary Pattern (LBP), Local Phase Quantization (LPQ), Robust Local Binary Pattern (RLBP), Gray-Level Co-occurrence Matrix (GLCM), and Gabor filters. The acoustic features are, in turn, described using Rhythm Histogram (RH), Rhythm Patterns (RP), and Statistical Spectrum Descriptor (SSD). To allow fairer comparisons, the experiments were performed on a database similar to one already used in other works. In the classification step, an SVM classifier was used, and the final results were obtained with 10-fold cross-validation. Across all tests performed, the combination of acoustic and visual features produced the best rate of this work, 91.08%.
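As an illustration of the texture-descriptor idea in the abstract, the sketch below computes a basic Local Binary Pattern histogram over a small grayscale patch. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the LBP neighborhood, radius, or any zoning of the spectrogram, so the 3x3 eight-neighbor variant, the function name `lbp_histogram`, and the toy patch values are all hypothetical choices made for illustration.

```python
def lbp_histogram(image):
    """Return the normalized 256-bin histogram of basic 3x3 LBP codes.

    `image` is a 2D list of gray levels (rows of equal length).
    Border pixels are skipped because they lack a full neighborhood.
    """
    # Neighbor offsets in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbor at least as bright as the center sets its bit.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    # Normalize so patches of different sizes give comparable vectors.
    return [h / total for h in hist] if total else hist


# Toy 4x4 "spectrogram" patch (hypothetical values, not from the paper).
patch = [
    [10, 20, 30, 40],
    [20, 50, 60, 30],
    [30, 70, 80, 20],
    [40, 30, 20, 10],
]
features = lbp_histogram(patch)
print(len(features), round(sum(features), 6))
```

In a pipeline like the one described, a vector of this kind would be one of the inputs to the SVM classifier. How the visual and acoustic descriptors are combined is not specified in the abstract, so that step is left out here.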