Laurence A. Clarfeld, Katherina D. Gieder, Robert Abrams, Christopher Bernier, Joseph Cahill, Susan Staats, Scott Wixsom, Therese M. Donovan
Ecological Informatics, Volume 89, Article 103166. Published 2025-04-30. DOI: 10.1016/j.ecoinf.2025.103166
https://www.sciencedirect.com/science/article/pii/S157495412500175X
Two-stage models improve machine learning classifiers in wildlife research: A case study in identifying false positive detections of Ruffed Grouse
Autonomous recording units are increasingly used to monitor wildlife across large geographic and temporal scales, paired with machine learning (ML) to automate detection. However, false positive detections from ML classifiers can produce erroneous ecological models, which in turn can lead to misguided management and conservation actions. We used a general two-stage approach to understand and reduce false positive detections: outputs of the primary classification model are passed to a secondary classification model, which yields the probability that a detection from the primary model is a true positive. We demonstrate this approach on two open-source models that detect Ruffed Grouse (Bonasa umbellus). We analyzed over 9500 h of acoustic data collected in 2022–2023 from the Green Mountain National Forest in Vermont, USA, and found that the two models detected different types of acoustic signals associated with differing life history traits. The first model yielded 4106 detections (71.5% true positives), while the second model yielded 524 detections (17.0% true positives). Secondary logistic regression models separated true positives from false positives with high accuracy (84.5% and 89.8%, respectively). Beyond improving Ruffed Grouse monitoring and conservation, our findings illustrate more broadly how two-stage ML approaches can improve the use of model-derived detections in wildlife research.
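The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the two features (a primary-model confidence score and a scaled time of day) are hypothetical stand-ins for whatever covariates a real study would use, and the hand-written gradient descent simply stands in for any logistic regression fitter. The names `fit_secondary` and `p_true_positive` are invented for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_secondary(features, labels, lr=0.5, epochs=500):
    """Fit a logistic regression by batch gradient descent.

    features: one vector per primary-model detection.
    labels: 1 if review confirmed a true positive, 0 for a false positive.
    """
    n, d = len(features), len(features[0])
    # Standardize each feature so gradient descent converges quickly.
    means = [sum(x[j] for x in features) / n for j in range(d)]
    sds = [math.sqrt(sum((x[j] - means[j]) ** 2 for x in features) / n) or 1.0
           for j in range(d)]
    scaled = [[(x[j] - means[j]) / sds[j] for j in range(d)] for x in features]
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(scaled, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return {"w": w, "b": b, "means": means, "sds": sds}


def p_true_positive(model, x):
    """Probability that one primary-model detection is a true positive."""
    z = model["b"]
    for wj, xj, m, s in zip(model["w"], x, model["means"], model["sds"]):
        z += wj * (xj - m) / s
    return sigmoid(z)


# Synthetic stand-in for manually reviewed detections (assumption, not study data).
rng = random.Random(0)


def synthetic_detection(is_true):
    score = rng.gauss(0.85 if is_true else 0.60, 0.10)  # primary-model confidence
    hour = rng.gauss(0.30 if is_true else 0.60, 0.15)   # hypothetical scaled time of day
    return [score, hour]


labels = [1 if rng.random() < 0.5 else 0 for _ in range(400)]
features = [synthetic_detection(y) for y in labels]

# Stage 2 is trained on reviewed detections and checked on held-out ones.
model = fit_secondary(features[:300], labels[:300])
predicted = [1 if p_true_positive(model, x) >= 0.5 else 0 for x in features[300:]]
accuracy = sum(p == y for p, y in zip(predicted, labels[300:])) / 100
print(f"held-out accuracy: {accuracy:.3f}")
```

In use, each new detection from the primary model would be passed through `p_true_positive`, and detections could be kept, discarded, or weighted according to that probability. The 0.5 cutoff here is an arbitrary choice for the sketch.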
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, and the critical need for informing sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.