Two-stage models improve machine learning classifiers in wildlife research: A case study in identifying false positive detections of Ruffed Grouse

IF 5.8, Zone 2 (Environmental Science and Ecology), Q1 (Ecology)
Laurence A. Clarfeld, Katherina D. Gieder, Robert Abrams, Christopher Bernier, Joseph Cahill, Susan Staats, Scott Wixsom, Therese M. Donovan
Ecological Informatics, vol. 89, Article 103166. Published 2025-04-30.
DOI: 10.1016/j.ecoinf.2025.103166
URL: https://www.sciencedirect.com/science/article/pii/S157495412500175X
Citations: 0

Abstract

Autonomous recording units are increasingly being used to monitor wildlife on large geographic and temporal scales, paired with machine learning (ML) to automate detection of wildlife. However, false positive detections from ML classifiers can result in erroneous ecological models that can lead to misguided management and conservation actions. We used a general two-stage approach to understand and reduce false positive detections, a technique in which outputs of the primary classification model are passed to a secondary classification model to yield the probability that a detection from the primary model is a true positive detection. This approach is demonstrated on two open-source models that detect Ruffed Grouse (Bonasa umbellus). We analyzed over 9500 h of acoustic data collected in 2022–2023 from the Green Mountain National Forest in Vermont, USA, and found the two models detected different types of acoustic signals associated with differing life history traits. The first model yielded 4106 detections (71.5 % true positives) while the second model yielded 524 detections (17.0 % true positives). Secondary logistic regression models separated true positives and false positives with high accuracy (84.5 % and 89.8 %, respectively). Our findings go beyond improving Ruffed Grouse monitoring and conservation efforts to, more broadly, illustrate how two-stage ML approaches can improve the use of model-derived detections in wildlife research.
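
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data below are synthetic, the single feature (a primary-model confidence score) is an assumption, and all names (`fit_logistic`, `predict_proba`, `scores`, `labels`) are hypothetical. It only shows the general pattern of fitting a secondary logistic regression on manually reviewed stage-one detections and using its output as the probability that a detection is a true positive.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, x):
    """P(true positive | standardized primary score x)."""
    bias, slope = weights
    return sigmoid(bias + slope * x)


def fit_logistic(xs, ys, lr=0.5, epochs=1000):
    """Fit (bias, slope) by batch gradient descent on the mean log loss."""
    bias, slope = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g_bias = g_slope = 0.0
        for x, y in zip(xs, ys):
            err = predict_proba((bias, slope), x) - y
            g_bias += err
            g_slope += err * x
        bias -= lr * g_bias / n
        slope -= lr * g_slope / n
    return bias, slope


# Synthetic stand-in for manually reviewed stage-one detections.
random.seed(0)
labels, scores = [], []
for _ in range(400):
    is_true = 1 if random.random() < 0.7 else 0
    # Assumption: true positives tend to receive higher primary scores.
    raw = random.gauss(0.8 if is_true else 0.5, 0.12)
    labels.append(is_true)
    scores.append(min(max(raw, 0.0), 1.0))

# Standardize the score so gradient descent converges quickly.
mean = sum(scores) / len(scores)
std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
xs = [(s - mean) / std for s in scores]

# Stage two: estimate P(true positive) for each stage-one detection.
weights = fit_logistic(xs, labels)
probs = [predict_proba(weights, x) for x in xs]
accuracy = sum((p >= 0.5) == bool(y) for p, y in zip(probs, labels)) / len(labels)

# Keep only detections the secondary model considers likely true positives.
kept = [i for i, p in enumerate(probs) if p >= 0.5]
print(f"accuracy on reviewed detections: {accuracy:.3f}")
print(f"kept {len(kept)} of {len(scores)} detections")
```

The accuracy printed here is measured on the same synthetic data used for fitting, so it is optimistic; a real analysis would evaluate on held-out reviewed detections. The 0.5 cutoff is also only illustrative, since the secondary model's probabilities could instead be carried forward into downstream ecological models.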
Source journal
Ecological Informatics (Environmental Science: Ecology)
CiteScore: 8.30
Self-citation rate: 11.80%
Articles published: 346
Review time: 46 days
Journal description: The journal Ecological Informatics is devoted to the publication of high quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data as well as the critical need for informing sustainable management in view of global environmental and climate change. The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.