Bryan Sagredo, Sonia Español-Jiménez, Felipe A. Tobar
Published in: 2021 IEEE Latin American Conference on Computational Intelligence (LA-CCI), 2021-10-05
DOI: 10.1109/LA-CCI48322.2021.9769846 (https://doi.org/10.1109/LA-CCI48322.2021.9769846)
Detection of blue whale vocalisations using a temporal-domain convolutional neural network
We present a framework for detecting blue whale vocalisations in underwater acoustic recordings. The proposed methodology comprises three stages: i) a preprocessing step in which the audio recordings are conditioned through normalisation, filtering, and denoising; ii) a label-propagation mechanism that ensures the annotations of the whale vocalisations are consistent; and iii) a convolutional neural network that operates directly on time-domain audio samples. Using 34 real-world underwater recordings (28 for training and 6 for testing), we obtained promising performance indicators, including an accuracy of 85.4% and a recall of 93.5%. Furthermore, in cases where our detector did not match the ground-truth labels, visual inspection indicates that it often detected partial whale calls that had been left unlabelled because they were incomplete.
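To make the preprocessing stage and the reported metrics more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the filter type, the pass band, or the sampling rate, so the FFT brick-wall filter, the 10 to 100 Hz band, and the 2000 Hz sampling rate are all assumptions chosen for illustration, and the function names are hypothetical. The denoising step, the label-propagation stage, and the network itself are not covered.

```python
import numpy as np


def normalise(x):
    """Remove the mean and scale the signal to unit peak amplitude."""
    x = np.asarray(x, dtype=float)
    x = x - np.mean(x)
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x


def bandpass_fft(x, fs, low_hz, high_hz):
    """Crude brick-wall band-pass: zero every FFT bin outside [low_hz, high_hz].

    This is an assumed stand-in for the paper's (unspecified) filter.
    """
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def accuracy_and_recall(y_true, y_pred):
    """Binary accuracy and recall, where 1 marks a window containing a call."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)    # calls correctly detected
    fn = np.sum(y_true & ~y_pred)   # calls that were missed
    accuracy = float(np.mean(y_true == y_pred))
    recall = float(tp / (tp + fn)) if (tp + fn) > 0 else 0.0
    return accuracy, recall


if __name__ == "__main__":
    fs = 2000.0                    # assumed sampling rate in Hz
    t = np.arange(4000) / fs       # two seconds of synthetic audio
    # Synthetic input: a low-frequency tone plus a higher-frequency interferer.
    raw = 3.0 * np.sin(2 * np.pi * 50 * t) + 2.0 * np.sin(2 * np.pi * 400 * t)
    conditioned = bandpass_fft(normalise(raw), fs, low_hz=10.0, high_hz=100.0)
    print("peak after conditioning:", np.max(np.abs(conditioned)))

    # Toy labels for five windows, purely illustrative.
    print(accuracy_and_recall([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Recall counts the fraction of labelled calls that the detector finds, which is why it can exceed accuracy when the detector also fires on windows that carry no label, as the abstract reports for partial calls.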