Self-supervised feature learning for acoustic data analysis

IF 5.8 · Zone 2, Environmental Science and Ecology · Q1 Ecology

Ecological Informatics Pub Date : 2024-11-20 DOI:10.1016/j.ecoinf.2024.102878

Ahmet Pala, Anna Oleynik, Ketil Malde, Nils Olav Handegard

Ecological Informatics, Volume 84, Article 102878. Full text: https://www.sciencedirect.com/science/article/pii/S1574954124004205

Citations: 0

Abstract

Acoustic surveys play a pivotal role in fisheries management. During the surveys, acoustic signals are sent into the water and the strength of the reflection, so-called backscatter, is recorded. The collected data are typically annotated manually, a process that is both labor-intensive and time-consuming, to support acoustic target classification (ATC). The primary objective of this study is to develop an annotation-free deep learning model that extracts acoustic features and improves the representation of acoustic data. For this purpose, we adopt a self-supervised method inspired by the Self DIstillation with NO Labels (DINO) model. Extracting useful acoustic features is an intricate task due to the inherent variability and complexity in biological targets, as well as environmental and technical factors influencing sound interactions. The proposed model is trained with three sampling methods: random sampling, which ignores class imbalance present in the acoustic survey data; class-balanced sampling, which ensures equal representation of known categories; and intensity-based sampling, which selects data to capture backscatter variations. The quality of extracted features is then evaluated and compared. We show that the extracted features lead to improvement, in comparison to using the untreated data, in the discriminative power of several machine learning methods (k-nearest neighbor (kNN), linear regression, multinomial logistic regression) for ATC. The improvement was measured through higher accuracy in kNN (77.55% vs. 71.93%), Macro AUC in logistic regression (0.92 vs. 0.80), and $R^{2}$ in linear regression (0.69 vs. 0.45) when comparing extracted features to the untreated data. Our findings highlight the advantage of applying emerging self-supervised techniques in fisheries acoustics. This study thus contributes to the ongoing efforts to improve the efficiency of acoustic surveys in fisheries management.
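
The three sampling methods named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the synthetic labels and intensities, and in particular the quantile-bin scheme used for intensity-based sampling are assumptions made here for illustration, since the abstract only states that this method "selects data to capture backscatter variations".

```python
import numpy as np

def random_sampling(labels, n, rng):
    """Draw n distinct indices uniformly; class imbalance carries over."""
    return rng.choice(len(labels), size=n, replace=False)

def class_balanced_sampling(labels, n, rng):
    """Draw the same number of indices from every class present in labels."""
    classes = np.unique(labels)
    per_class = n // len(classes)
    picks = []
    for c in classes:
        idx = np.flatnonzero(labels == c)
        # Fall back to sampling with replacement only for very small classes.
        picks.append(rng.choice(idx, size=per_class, replace=len(idx) < per_class))
    return np.concatenate(picks)

def intensity_based_sampling(intensity, n, rng, n_bins=4):
    """Assumed scheme: equal draws from quantile bins of backscatter intensity."""
    edges = np.quantile(intensity, np.linspace(0, 1, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, intensity, side="right") - 1, 0, n_bins - 1)
    # Treat each intensity bin as a pseudo-class and balance across bins.
    return class_balanced_sampling(bins, n, rng)

# Synthetic, heavily imbalanced stand-in data (hypothetical, not survey data).
rng = np.random.default_rng(0)
labels = rng.choice(3, size=1000, p=[0.9, 0.07, 0.03])
intensity = rng.normal(-70.0, 10.0, size=1000)  # dB-like values, illustrative only

balanced = class_balanced_sampling(labels, 90, rng)
print(np.bincount(labels[balanced], minlength=3))
```

Random sampling reproduces the imbalance of the underlying data, while the balanced variant equalizes class counts at the cost of reusing samples from rare classes. Unlike class-balanced sampling, the intensity-based variant sketched here needs no annotations.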

Source journal: Ecological Informatics (Environmental Sciences, Ecology)
CiteScore: 8.30
Self-citation rate: 11.80%
Articles published: 346
Review time: 46 days

Journal description: The journal Ecological Informatics is devoted to the publication of high quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data as well as the critical need for informing sustainable management in view of global environmental and climate change. The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.