Self-supervised feature learning for acoustic data analysis
Ahmet Pala, Anna Oleynik, Ketil Malde, Nils Olav Handegard
Ecological Informatics, volume 84, Article 102878 (published 2024-11-20)
DOI: 10.1016/j.ecoinf.2024.102878
https://www.sciencedirect.com/science/article/pii/S1574954124004205
Abstract
Acoustic surveys play a pivotal role in fisheries management. During a survey, acoustic signals are transmitted into the water and the strength of the reflection, the so-called backscatter, is recorded. To support acoustic target classification (ATC), the collected data are typically annotated manually, a process that is both labor-intensive and time-consuming. The primary objective of this study is to develop an annotation-free deep learning model that extracts acoustic features and improves the representation of acoustic data. For this purpose, we adopt a self-supervised method inspired by the Self DIstillation with NO Labels (DINO) model. Extracting useful acoustic features is an intricate task because of the inherent variability and complexity of biological targets, as well as the environmental and technical factors that influence sound interactions. The proposed model is trained with three sampling methods: random sampling, which ignores the class imbalance present in the acoustic survey data; class-balanced sampling, which ensures equal representation of known categories; and intensity-based sampling, which selects data to capture backscatter variations. The quality of the extracted features is then evaluated and compared. We show that, compared with the untreated data, the extracted features improve the discriminative power of several machine learning methods for ATC: k-nearest neighbor (kNN), linear regression, and multinomial logistic regression. The improvement appears as higher accuracy in kNN (77.55% vs. 71.93%), higher macro AUC in logistic regression (0.92 vs. 0.80), and higher R² in linear regression (0.69 vs. 0.45). Our findings highlight the advantage of applying emerging self-supervised techniques in fisheries acoustics. This study thus contributes to the ongoing efforts to improve the efficiency of acoustic surveys in fisheries management.
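The three sampling methods named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' procedure, so everything here is an assumption: the function names, the synthetic labels and intensities, and in particular the reading of intensity-based sampling as an equal draw from equal-width bins of mean backscatter. It is meant only to show how the three strategies differ in which patches they select.

```python
import random
from collections import defaultdict


def random_sampling(n_items, n, rng):
    """Uniform draw without replacement; class frequencies follow the data."""
    return rng.sample(range(n_items), n)


def _equal_draw(groups, n, rng):
    """Draw n // len(groups) indices from every group of indices."""
    per_group = n // len(groups)
    picks = []
    for _, idx in sorted(groups.items()):
        if len(idx) >= per_group:
            picks.extend(rng.sample(idx, per_group))
        else:
            # Group is too small: oversample it with replacement.
            picks.extend(rng.choices(idx, k=per_group))
    return picks


def class_balanced_sampling(labels, n, rng):
    """Equal number of indices per known class (uses existing labels)."""
    groups = defaultdict(list)
    for i, label in enumerate(labels):
        groups[label].append(i)
    return _equal_draw(groups, n, rng)


def intensity_based_sampling(intensity, n, rng, n_bins=5):
    """Equal number of indices per equal-width intensity bin (assumed scheme)."""
    lo, hi = min(intensity), max(intensity)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values match
    groups = defaultdict(list)
    for i, value in enumerate(intensity):
        groups[min(int((value - lo) / width), n_bins - 1)].append(i)
    return _equal_draw(groups, n, rng)


# Hypothetical, strongly imbalanced data set of 10,000 patches.
rng = random.Random(0)
labels = [0] * 9000 + [1] * 800 + [2] * 200
means = {0: -70.0, 1: -60.0, 2: -50.0}  # made-up mean backscatter in dB
intensity = [rng.gauss(means[c], 3.0) for c in labels]

uniform = random_sampling(len(labels), 300, rng)
balanced = class_balanced_sampling(labels, 300, rng)
by_intensity = intensity_based_sampling(intensity, 300, rng)
```

In this sketch, only the class-balanced strategy needs labels. Random and intensity-based sampling work from the raw data alone, which is the relevant property for an annotation-free pipeline.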
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, and the critical need to inform sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.