Improving AI performance in wildlife monitoring through species and environment-specific training: A case study on desert Bighorn sheep
Owen S. Okuley, Christina M. Aiello, Will Glad, Kyle Perkins, Richard Ianniello, Neal Darby, Clinton W. Epps
Ecological Informatics, volume 89, Article 103179. Published 2025-05-02. DOI: 10.1016/j.ecoinf.2025.103179
https://www.sciencedirect.com/science/article/pii/S1574954125001888
Abstract
Motion-activated cameras are widely used to monitor wildlife, offering a non-intrusive and cost-effective means to collect high volumes of data. Artificial intelligence (AI) models can expedite image processing, but automated species classifications can be too inaccurate to meet end-users' needs. This study evaluates how the selection of training data influences AI detection of a focal species (desert bighorn sheep; Ovis canadensis nelsoni) across similar and novel locations. We compared two AI models, a species-specialist (deep_sheep) and a species-generalist (CameraTrapDetectoR), identified sources of bias, and retrained the specialist model using two datasets targeted toward biases associated with classification failure. Testing on 95,547 images from 36 water sources (5 novel) in the Mojave and Sonoran Deserts revealed that the specialist model outperformed the generalist by 21.44% in accuracy and reduced false negatives by 45.18%. The specialist model was retrained first on site-representative data, then on both site-representative and extreme image-condition data. Successive retraining iterations reduced the false negative rate (36.94% → 6.23% → 4.67%) and improved reliability across sites, at the cost of a corresponding increase in false positive rate (2.87% → 15.22% → 23.97%) and variation. The site-representative retraining had the highest overall accuracy. Accuracy at out-of-sample sites remained comparable to the full dataset, though minor performance declines were observed. We found that tailoring an AI's training to single-species classification and to conditions within specific environments produced robust classification accuracy with minimal data requirements. By narrowing objectives while ensuring adequate training data variety, we achieved 89.33% accuracy with a small fraction of the training data required by similarly performing models.
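The abstract reports accuracy, false negative rate, and false positive rate without restating their definitions. The sketch below shows the conventional binary-classification definitions, assuming each image is scored as sheep or no sheep. The `ConfusionCounts` class, the function names, and the example counts are illustrative assumptions, not the paper's code or data, and the authors' exact metric definitions may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Image-level counts for a binary sheep / no-sheep classifier (hypothetical)."""
    tp: int  # sheep present, model predicts sheep
    fp: int  # no sheep present, model predicts sheep
    tn: int  # no sheep present, model predicts no sheep
    fn: int  # sheep present, model predicts no sheep

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn


def accuracy(c: ConfusionCounts) -> float:
    """Share of all images classified correctly."""
    return (c.tp + c.tn) / c.total


def false_negative_rate(c: ConfusionCounts) -> float:
    """Share of sheep images that the model missed."""
    return c.fn / (c.fn + c.tp)


def false_positive_rate(c: ConfusionCounts) -> float:
    """Share of sheep-free images that the model flagged as sheep."""
    return c.fp / (c.fp + c.tn)


# Made-up counts for illustration only; these are NOT results from the study.
example = ConfusionCounts(tp=90, fp=30, tn=170, fn=10)

print(f"accuracy            = {accuracy(example):.4f}")
print(f"false negative rate = {false_negative_rate(example):.4f}")
print(f"false positive rate = {false_positive_rate(example):.4f}")
```

Under these definitions, the two error rates have different denominators (sheep images versus sheep-free images), so a retraining step can lower one while raising the other, and overall accuracy then depends on how many images of each kind the test set contains.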
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, and the critical need to inform sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.