Random forest regression models in ecology: Accounting for messy biological data and producing predictions with uncertainty

IF 2.2 | Zone 2, Agricultural and Forestry Sciences | Q2 FISHERIES
Caitlin I. Allen Akselrud
Fisheries Research, vol. 280, Article 107161. Published 2024-09-06. DOI: 10.1016/j.fishres.2024.107161
https://www.sciencedirect.com/science/article/pii/S016578362400225X
Citations: 0

Abstract

Machine learning methods such as random forest regression models are useful tools in ecology when applied correctly, but features inherent to ecological data sets can lead to over-fitting or uncertain predictions. Here, a set of methods is outlined for making random forest predictions while accounting for temporal autocorrelation and for sparse, short, or missing data. Methods are also provided for estimating the prediction uncertainty that arises from the combination of inherent randomness in the random forest algorithm and sparse input data. This suite of methods was used to generate pre-season predictions of total catch, with uncertainty, for California market squid (Doryteuthis opalescens), the most valuable fishery in California by ex-vessel value. The methodology is robust, incorporating key cross-validation and hyperparameter tuning techniques from across disciplines, and flexible, making it applicable to ecological and fisheries data sets beyond market squid.
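
The abstract does not spell out the cross-validation scheme or the uncertainty calculation, so the following is only a minimal sketch of one common way to approach the two ideas, not the paper's method. It assumes forward-chaining (expanding-window) splits, so that a model is never validated on observations that precede its training data, and it assumes that prediction uncertainty is summarized by refitting with different random seeds and taking empirical percentiles of the resulting predictions. The function names `expanding_window_splits` and `summarize_repeated_predictions` are hypothetical, and the numbers in the demo are made up.

```python
from statistics import mean, quantiles


def expanding_window_splits(n_obs, n_folds, min_train):
    """Yield (train, test) index lists; each test block lies after its training data."""
    if n_folds < 1 or min_train < 1:
        raise ValueError("n_folds and min_train must be positive")
    fold_size = (n_obs - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough observations for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        # The last fold absorbs any leftover observations.
        test_end = n_obs if k == n_folds - 1 else train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))


def summarize_repeated_predictions(preds):
    """Return (mean, 5th percentile, 95th percentile) of predictions from repeated fits."""
    # Assumption: each value in preds comes from one refit with a different seed.
    cuts = quantiles(preds, n=20, method="inclusive")  # 19 cut points at 5% steps
    return mean(preds), cuts[0], cuts[-1]


if __name__ == "__main__":
    # Ten time-ordered observations, three folds, at least four training points.
    for train, test in expanding_window_splits(n_obs=10, n_folds=3, min_train=4):
        print(train, test)
    # Made-up predictions standing in for 21 repeated model fits.
    made_up = [float(v) for v in range(21)]
    print(summarize_repeated_predictions(made_up))
```

In practice the index lists would select rows of a time-ordered data set for fitting and scoring a random forest, and the list passed to `summarize_repeated_predictions` would hold one prediction per refit.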

Source journal
Fisheries Research (Agricultural and Forestry Sciences: Fisheries)
CiteScore: 4.50
Self-citation rate: 16.70%
Articles published: 294
Review time: 15 weeks
Journal description: This journal provides an international forum for the publication of papers in the areas of fisheries science, fishing technology, fisheries management and relevant socio-economics. The scope covers fisheries in salt, brackish and freshwater systems, and all aspects of associated ecology, environmental aspects of fisheries, and economics. Both theoretical and practical papers are acceptable, including laboratory and field experimental studies relevant to fisheries. Papers on the conservation of exploitable living resources are welcome. Review and Viewpoint articles are also published. As the specified areas inevitably impinge on and interrelate with each other, the approach of the journal is multidisciplinary, and authors are encouraged to emphasise the relevance of their own work to that of other disciplines. The journal is intended for fisheries scientists, biological oceanographers, gear technologists, economists, managers, administrators, policy makers and legislators.