Identifying optimal variables for machine-learning-based fish distribution modeling

IF 1.9 | Zone 2 (Agricultural and Forestry Sciences) | JCR Q2 (Fisheries)
Shaohua Xu, Jintao Wang, Xinjun Chen, Jiangfeng Zhu
DOI: 10.1139/cjfas-2023-0197 (https://doi.org/10.1139/cjfas-2023-0197)
Journal: Canadian Journal of Fisheries and Aquatic Sciences
Published: 2024-03-05
Citations: 0

Abstract

Canadian Journal of Fisheries and Aquatic Sciences, Ahead of Print.
Machine learning plays a central role in modeling fish distribution patterns. As observational methods multiply the explanatory variables available for describing fish habitat, it becomes necessary to identify the optimal combination of these variables for fish distribution modeling. We proposed a feature selection technique, recursive feature elimination with cross-validation (RFECV), to determine optimal variable combinations for yellowfin tuna distribution in the Pacific Ocean. Four tree-based models driven by RFECV (random forest, eXtreme Gradient Boosting, Light Gradient Boosting Machine, and categorical boosting) were developed using comprehensive fisheries and biotic/abiotic data. Habitat variables including sea temperature, dissolved oxygen concentration, chlorophyll-a concentration, sea salinity, and sea surface height were identified as significant features by all models. Each model was trained on its corresponding selected variables and then used to predict the spatiotemporal distribution of yellowfin tuna from 1995 to 2019. The results could inform the sustainable exploitation of yellowfin tuna in the Pacific Ocean and provide a feature selection benchmark for machine-learning-based distribution modeling of other pelagic species.
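As a rough illustration of the RFECV idea described in the abstract, the sketch below runs scikit-learn's RFECV with a random forest and then refits a model on the selected variables. It is not the authors' pipeline: the synthetic dataset, the regression framing, the hyperparameters, and the generic variable names (var_0, var_1, ...) are all assumptions made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV
from sklearn.model_selection import KFold

# Synthetic stand-in for habitat data (assumption): 10 candidate variables,
# of which 4 actually drive the response.
X, y = make_regression(
    n_samples=300, n_features=10, n_informative=4, noise=5.0, random_state=0
)
names = [f"var_{i}" for i in range(X.shape[1])]  # hypothetical variable names

# RFECV repeatedly drops the least important variable (step=1) and uses
# cross-validation to score each candidate subset size.
selector = RFECV(
    estimator=RandomForestRegressor(n_estimators=30, random_state=0),
    step=1,
    cv=KFold(n_splits=3, shuffle=True, random_state=0),
    scoring="r2",
    min_features_to_select=1,
)
selector.fit(X, y)

# Variables retained by the cross-validated elimination.
selected = [name for name, keep in zip(names, selector.support_) if keep]
print("Number of selected variables:", selector.n_features_)
print("Selected variables:", selected)

# Train a fresh model on the selected variables only, then predict with it.
X_selected = selector.transform(X)
final_model = RandomForestRegressor(n_estimators=30, random_state=0)
final_model.fit(X_selected, y)
predictions = final_model.predict(X_selected)
```

The same selector can wrap any estimator that exposes feature importances or coefficients, which is one plausible way to drive several tree-based models with the same selection procedure.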
Source journal
Canadian Journal of Fisheries and Aquatic Sciences (Agricultural and Forestry Sciences: Marine and Freshwater Biology)
CiteScore: 4.60
Self-citation rate: 12.50%
Articles published: 148
Review time: 6-16 weeks
About the journal: The Canadian Journal of Fisheries and Aquatic Sciences is the primary publishing vehicle for the multidisciplinary field of aquatic sciences. It publishes perspectives (syntheses, critiques, and re-evaluations), discussions (comments and replies), articles, and rapid communications relating to current research on -omics, cells, organisms, populations, ecosystems, or processes that affect aquatic systems. The journal seeks to amplify, modify, question, or redirect accumulated knowledge in the field of fisheries and aquatic science.