Detection of blue whale vocalisations using a temporal-domain convolutional neural network
Bryan Sagredo, Sonia Español-Jiménez, Felipe A. Tobar
2021 IEEE Latin American Conference on Computational Intelligence (LA-CCI), published 2021-10-05
DOI: 10.1109/LA-CCI48322.2021.9769846
Abstract
We present a framework for detecting blue whale vocalisations in submarine acoustic recordings. The proposed methodology comprises three stages: i) a preprocessing step in which the audio recordings are conditioned through normalisation, filtering, and denoising; ii) a label-propagation mechanism that ensures the consistency of the whale-vocalisation annotations; and iii) a convolutional neural network that receives raw audio samples. Based on 34 real-world submarine recordings (28 for training and 6 for testing), we obtained promising performance indicators, including an accuracy of 85.4% and a recall of 93.5%. Furthermore, even in cases where our detector did not match the ground-truth labels, visual inspection confirms that our approach can detect probable fragments of whale calls that were left unlabelled because they were not complete calls.
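The preprocessing stage described above (normalisation, filtering, denoising) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the band-pass cutoffs (10–100 Hz, the typical low-frequency range of blue whale calls), the filter order, and the soft-threshold denoiser are all assumptions made here for the sake of the example.

```python
import numpy as np
from scipy import signal

def preprocess(audio, fs, low_hz=10.0, high_hz=100.0):
    """Condition a raw hydrophone recording: normalise, band-pass, denoise."""
    # Normalisation: remove DC offset and scale to unit peak amplitude
    x = audio - np.mean(audio)
    x = x / np.max(np.abs(x))

    # Filtering: band-pass around the assumed low-frequency band of blue whale calls
    sos = signal.butter(4, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    x = signal.sosfiltfilt(sos, x)

    # Denoising (illustrative): soft-threshold samples below 5% of the peak
    thresh = 0.05 * np.max(np.abs(x))
    x = np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)
    return x
```

A zero-phase filter (`sosfiltfilt`) is used here so that filtering does not shift call onsets in time, which matters when the downstream labels are aligned to the waveform.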
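A temporal-domain CNN, as opposed to a spectrogram-based one, convolves learned filters directly against the audio samples. The forward pass of such a detector can be sketched in plain NumPy; the layer sizes, the single-conv-layer architecture, and the max-pool/sigmoid head are assumptions for illustration and not the network reported in the paper.

```python
import numpy as np

def conv1d(x, kernels):
    """Valid-mode 1-D convolution of a waveform x with a bank of kernels.

    x: (length,) audio samples; kernels: (n_filters, k) learned temporal filters.
    Returns (n_filters, length - k + 1) feature maps.
    """
    n_filters, k = kernels.shape
    out_len = len(x) - k + 1
    out = np.empty((n_filters, out_len))
    for i in range(out_len):
        out[:, i] = kernels @ x[i:i + k]  # dot each filter with the local window
    return out

def detect(x, kernels, w, b):
    """Return the probability that window x contains a whale call."""
    h = np.maximum(conv1d(x, kernels), 0.0)  # temporal convolution + ReLU
    pooled = h.max(axis=1)                   # global max-pool over time
    logit = pooled @ w + b                   # linear classification head
    return 1.0 / (1.0 + np.exp(-logit))      # sigmoid call probability
```

Max-pooling over the time axis makes the score depend on whether a filter fires anywhere in the window, which is a common design for detection (presence/absence) rather than localisation.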