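Automated note annotation after bioacoustic classification: Unsupervised clustering of extracted acoustic features improves detection of a cryptic owl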

IF 7.3, Zone 2 (Environmental Science and Ecology), JCR Q1 (Ecology)

Ecological Informatics, Volume 90, Article 103222 Pub Date: 2025-05-25 DOI: 10.1016/j.ecoinf.2025.103222

Callan Alexander, Robert Clemens, Paul Roe, Susan Fuller


Citations: 0

Abstract



Automated note annotation after bioacoustic classification: Unsupervised clustering of extracted acoustic features improves detection of a cryptic owl

Passive acoustic monitoring and machine learning are increasingly used to survey threatened species. When automated detection models are applied to large novel datasets, false-positive detections are likely even for high-performing models, and arbitrary thresholds may result in missed detections. Manual validation of outputs is time-consuming, and additional fine-scale annotation of individual notes is impractical for large datasets and difficult to automate when using passive field recordings. This research presents an acoustic monitoring pipeline that employs a multi-stage hybrid approach: initial detection using a convolutional neural network classifier, followed by segmentation and iterative unsupervised clustering of extracted acoustic features using UMAP and HDBSCAN to remove label noise. We applied the pipeline to a large acoustic dataset comprising 2764 h of environmental recordings and tested the utility of the approach on territorial calls of Australia's largest owl: the threatened Powerful Owl (Ninox strenua). The pipeline reduced the large acoustic dataset to 10,116 annotations, of which 9399 (93%) were correctly annotated individual notes of the target species. The clustering process also eliminated 88% of false-positive detections while retaining 95% of true positives (F1 = 0.94). The approach is highly scalable, can be applied to very large acoustic datasets, and can rapidly collect note-level annotations from noisy field recordings. The acoustic features derived from this methodology identified population differences in our test dataset and enable further exploration of song structure, geographic variation, and vocal individuality. The clustering process also facilitates a semi-supervised learning approach, allowing rapid selection of uncertain examples for model improvement. The pipeline helps to address two key challenges in bioacoustic monitoring: the need for manual validation of automated detections and the difficulty of obtaining accurate note-level annotations in noisy field recordings. Adaptation of these methods to other species and vocalisations may facilitate improved detection and investigation of vocal characteristics across different populations or regions.
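The percentages quoted in the abstract can be checked with a little arithmetic. The sketch below first recomputes the share of correct annotations (9399 of 10,116), then shows how removing false positives and retaining true positives combine into precision, recall, and F1 under the standard definitions. The abstract does not give the detection counts behind its F1 value or say exactly how it was computed, so the counts `1000` and `500` are hypothetical, the function names are illustrative, and recall is assumed to be measured only against the classifier's own true-positive detections. The resulting F1 is not a reproduction of the paper's figure.

```python
"""Worked arithmetic for figures quoted in the abstract (illustrative only)."""


def annotation_precision(correct: int, total: int) -> float:
    """Fraction of pipeline annotations that are correct notes of the target species."""
    return correct / total


def filter_metrics(tp_before: int, fp_before: int,
                   tp_retained: float, fp_removed: float) -> tuple[float, float, float]:
    """Precision, recall and F1 after a filtering step.

    tp_before / fp_before: true and false positives entering the filter
    (HYPOTHETICAL inputs, not reported in the abstract).
    tp_retained: fraction of true positives kept by the filter.
    fp_removed: fraction of false positives discarded by the filter.
    Recall is relative to tp_before only (an assumption of this sketch).
    """
    tp_after = tp_before * tp_retained
    fp_after = fp_before * (1.0 - fp_removed)
    fn_after = tp_before - tp_after  # true positives lost by the filter
    precision = tp_after / (tp_after + fp_after)
    recall = tp_after / (tp_after + fn_after)
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Figures taken from the abstract: 9399 correct out of 10,116 annotations.
    share = annotation_precision(9399, 10116)
    print(f"correct annotations: {share:.3f}")  # about 0.929, i.e. 93% after rounding

    # Hypothetical detection counts combined with the abstract's 95% / 88% rates.
    p, r, f1 = filter_metrics(tp_before=1000, fp_before=500,
                              tp_retained=0.95, fp_removed=0.88)
    print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because precision depends on the ratio of false to true positives entering the filter, the same 88% and 95% rates give different F1 values for different starting counts. The reported F1 therefore cannot be recovered from the abstract alone.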


Source journal

Ecological Informatics (Environmental Sciences - Ecology)

CiteScore: 8.30

Self-citation rate: 11.80%

Articles published: 346

Review time: 46 days

Journal description: The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, and the critical need to inform sustainable management in view of global environmental and climate change. The journal is interdisciplinary, sitting at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.