Jan Huus, Kevin G. Kelly, Erin M. Bayne, Elly C. Knight
Title: HawkEars: A regional, high-performance avian acoustic classifier
Journal: Ecological Informatics, volume 87, Article 103122 (impact factor 5.8; JCR Q1, Ecology)
Published: 2025-03-29
DOI: 10.1016/j.ecoinf.2025.103122
URL: https://www.sciencedirect.com/science/article/pii/S1574954125001311
Citations: 0
Abstract
Passive acoustic monitoring is rapidly emerging as a dominant approach for studying acoustic wildlife, and neural networks are an increasingly common and promising way to extract detections of particular species from acoustic recordings. Existing avian classifiers are either small custom models for focal species or large models that attempt to classify the entire global avian community, which suggests a possible tradeoff between classifier performance and species coverage. We argue that building domain-specific classifiers for particular geographic regions improves performance in exchange for reduced species coverage, and we present HawkEars, a regional avian classifier for Canada that includes 314 bird and 13 amphibian species. A major challenge in classifier development is the weak labeling of open-access datasets. We developed a novel solution that uses embedding-based search to efficiently generate strong labels. We evaluated HawkEars performance for bird species relative to two prominent avian community classifiers, BirdNET and Perch, on two datasets representing two applications: bird community surveys and studies of vocal activity rate. We found that HawkEars had substantially higher performance across all metrics, detected on average two more species per recording minute in our community evaluation dataset, and, at a precision of 0.9, had a recall nearly twice that of Perch and four times that of BirdNET in our vocal activity evaluation dataset. We suggest HawkEars provides better classification performance because a smaller species pool allows more training and tuning resources to be allocated per species and reduces the risk of class overlap, and because our strong labeling method ensures high-quality training data.
HawkEars is a substantial improvement for practitioners studying acoustic wildlife in Canada and the northern United States, and practitioners in other regions can use the open-source HawkEars code to build classifiers for their own regions. By continuing to improve deep-learning classification performance, HawkEars has the potential to substantially improve the efficiency and utility of passive acoustic monitoring studies.
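To make the "recall at a precision of 0.9" comparison in the abstract concrete, here is a minimal Python sketch of one common way such a figure can be computed: rank detections by classifier score, sweep the score threshold, and report the highest recall among thresholds whose precision meets the target. The function name `recall_at_precision` and the toy data are hypothetical and are not taken from the HawkEars code base; the paper's actual evaluation procedure may differ.

```python
from typing import Sequence


def recall_at_precision(scores: Sequence[float], labels: Sequence[int],
                        target_precision: float = 0.9) -> float:
    """Highest recall at any score threshold whose precision is at least
    target_precision. Returns 0.0 if no threshold reaches the target.

    Hypothetical helper for illustration; labels are 1 (species present)
    or 0 (species absent).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive label")

    # Rank detections from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_recall = 0.0
    true_pos = 0
    for i, (score, label) in enumerate(ranked):
        true_pos += label
        # A threshold cannot split detections with identical scores,
        # so only evaluate at the end of each run of tied scores.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        kept = i + 1  # number of detections at or above this threshold
        precision = true_pos / kept
        recall = true_pos / total_positives
        if precision >= target_precision:
            best_recall = max(best_recall, recall)
    return best_recall


# Toy data, made up for illustration (not results from the paper).
scores = [0.95, 0.90, 0.85, 0.80, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 1, 0, 1, 1, 0, 0]
print(recall_at_precision(scores, labels, target_precision=0.9))
```

With this toy data, the top three detections are all correct, so the strictest qualifying threshold keeps three of the five true positives. Comparing classifiers at a shared precision target in this way puts them on an equal footing even when their raw score scales differ.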
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, and the critical need to inform sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.