Revised clusters of annotated unknown sounds in the Belgian part of the North sea

Frontiers in Remote Sensing. Pub Date: 2024-06-04. DOI: 10.3389/frsen.2024.1384562

Arienne Calonge, Clea Parcerisas, Elena Schall, E. Debusschere





Revised clusters of annotated unknown sounds in the Belgian part of the North sea

Acoustic signals, especially those of biological origin, remain unexplored in the Belgian part of the North Sea (BPNS). Although dominated by anthrophony (sounds from human activities), the BPNS is expected to be acoustically diverse given the presence of biodiverse sandbanks, gravel beds and artificial hard structures. Under the framework of the LifeWatch Broadband Acoustic Network, sound data have been collected since the spring of 2020. These recordings, encompassing biophony, geophony and anthrophony, have been listened to and annotated for unknown, acoustically salient sounds. To obtain the acoustic features of these annotations, we used two existing automatic feature extractors: the Animal Vocalization Encoder based on Self-Supervision (AVES) and a convolutional autoencoder network (CAE) retrained on the data from this study. An unsupervised density-based clustering algorithm (HDBSCAN) was applied to predict clusters. We coded a grid search function to reduce the dimensionality of the feature sets and to adjust the hyperparameters of HDBSCAN. We searched the hyperparameter space for the optimal combination of parameter values based on two selected clustering evaluation measures: the homogeneity and the density-based clustering validation (DBCV) scores. Although both feature sets produced meaningful clusters, the AVES feature sets resulted in more solid, homogeneous clusters with relatively lower intra-cluster distances, and so appear more advantageous for the purpose and dataset of this study. The 26 final clusters we obtained were revised by a bioacoustics expert. We were able to name and describe 10 unique sounds, but only the clusters named ‘Jackhammer’ and ‘Tick’ can be interpreted as biological with certainty. Although unsupervised clustering is conventional in ecological research, we highlight its practical use in revising clusters of annotated unknown sounds. The revised clusters detailed in this study already define a few groups of distinct and recurring sounds that could serve as a preliminary component of a valid annotated training dataset, potentially feeding supervised machine learning and classifier models.
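The grid search scored by homogeneity that the abstract describes can be illustrated with a minimal sketch. It is not the authors' code. The function names (`homogeneity`, `grid_search`, `toy_cluster`), the `bin_width` parameter and the toy data are all hypothetical. The clustering step is a simple one-dimensional binning placeholder rather than HDBSCAN, and the DBCV score used in the study is omitted. Homogeneity is computed with its standard entropy-based definition, 1 - H(C|K) / H(C), where C are the annotation classes and K the predicted clusters.

```python
import itertools
import math
from collections import Counter


def homogeneity(true_labels, cluster_labels):
    """Return 1 - H(C|K) / H(C); 1.0 means every cluster holds one class only."""
    n = len(true_labels)
    if n == 0:
        return 1.0
    class_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_labels)
    joint_counts = Counter(zip(true_labels, cluster_labels))
    # Entropy of the annotation classes, H(C).
    h_c = -sum((c / n) * math.log(c / n) for c in class_counts.values())
    if h_c == 0.0:
        return 1.0  # a single class is trivially homogeneous
    # Conditional entropy of the classes given the clusters, H(C|K).
    h_c_given_k = -sum(
        (n_ck / n) * math.log(n_ck / cluster_counts[k])
        for (_, k), n_ck in joint_counts.items()
    )
    return 1.0 - h_c_given_k / h_c


def toy_cluster(features, bin_width):
    """Hypothetical placeholder for a clustering algorithm (NOT HDBSCAN):
    assigns each 1-D feature value to a fixed-width bin."""
    return [int(x // bin_width) for x in features]


def grid_search(cluster_fn, features, true_labels, param_grid):
    """Try every parameter combination and keep the one with the best homogeneity."""
    names = sorted(param_grid)
    best_score, best_params = None, None
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        labels = cluster_fn(features, **params)
        score = homogeneity(true_labels, labels)
        if best_score is None or score > best_score:
            best_score, best_params = score, params
    return best_score, best_params


# Toy data: seven 1-D "features" with three hypothetical annotation classes.
features = [0.1, 0.2, 0.3, 5.1, 5.2, 9.8, 9.9]
annotations = ["a", "a", "a", "b", "b", "c", "c"]
best_score, best_params = grid_search(
    toy_cluster, features, annotations, {"bin_width": [20.0, 6.0, 2.0]}
)
print(best_score, best_params)
```

Homogeneity alone reaches its maximum when every sample forms its own cluster, so a search driven only by this score tends to over-split. Pairing it with a second measure, as the study does with DBCV, is one way to counter that.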

