Bird species classification using visual and acoustic features extracted from audio signal
D. Lucio, Yandre M. G. Costa
2016 35th International Conference of the Chilean Computer Science Society (SCCC), published 2016-10-01
DOI: 10.1109/SCCC.2016.7836063 (https://doi.org/10.1109/SCCC.2016.7836063)
Citations: 1
Abstract
This work presents a system for automatic bird species classification based on acoustic and visual features extracted from birdsong. The visual features are extracted from spectrogram images generated from the birdsong audio, while the acoustic features are taken directly from the audio. Texture descriptors were used to describe the spectrogram content, as texture is the main visual content found in this kind of image. The texture operators used are Local Binary Pattern (LBP), Local Phase Quantization (LPQ), Robust Local Binary Pattern (RLBP), Gray-Level Co-occurrence Matrix (GLCM) and Gabor filters. The acoustic features are, in turn, described using Rhythm Histogram (RH), Rhythm Patterns (RP) and Statistical Spectrum Descriptor (SSD). To allow fairer comparisons, the experiments were performed on a database similar to one already used in other works. In the classification step, an SVM classifier was used, and the final results were obtained using 10-fold cross-validation. Across all tests performed, the combination of acoustic and visual features produced the best classification rate of this work, 91.08%.
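To make the idea of describing a spectrogram by its texture more concrete, the following is a minimal sketch of the basic 3x3 Local Binary Pattern operator applied to a tiny grayscale grid. It is an illustration only: the abstract does not state which LBP variant, radius, or spectrogram settings the authors used, so the neighbour ordering, the `>=` comparison, the 256-bin histogram, and the toy `patch` values are all assumptions made for this example.

```python
# Basic 8-neighbour LBP on a small grayscale grid (pure Python, no dependencies).
# Assumption: the paper's exact LBP variant and parameters are not given in the
# abstract; this is the plain 3x3 operator, shown only as an illustration.

# Neighbour offsets (row, col), clockwise starting from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Return the 8-bit LBP code of the interior pixel at (r, c)."""
    centre = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # A neighbour sets its bit when it is at least as bright as the centre.
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Return a normalised 256-bin histogram of LBP codes over interior pixels."""
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical toy "spectrogram" patch: rows are frequency bins, columns are time frames.
patch = [
    [10, 20, 30, 40],
    [20, 50, 60, 30],
    [30, 40, 90, 20],
    [40, 30, 20, 10],
]
features = lbp_histogram(patch)  # 256-dimensional texture feature vector
```

A histogram like `features` is the kind of fixed-length vector that could then be passed to a classifier such as an SVM; how the authors combined the visual and acoustic descriptors is not specified in the abstract and is not modelled here.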