Predictive performance from abundance distribution models of Vinciguerria lucetia larvae in the southern portion of the California current system using XGBOOST
Authors: Rubén Esteban García-Gómez, Gerardo Aceves-Medina, Héctor Villalobos, Sylvia Patricia Adelheid Jiménez Rosenberg, Reginaldo Durazo
Journal: Deep-Sea Research Part II: Topical Studies in Oceanography, volume 212, Article 105336
Published: 2023-09-24
DOI: 10.1016/j.dsr2.2023.105336
URL: https://www.sciencedirect.com/science/article/pii/S0967064523000863
Abstract
Vinciguerria lucetia is a mesopelagic fish whose larvae are almost permanently present in the southern portion of the California Current System. Because of its sensitivity to environmental changes, the species has been considered an indicator of water masses and interannual variability. Larval fish abundance data recorded from 1997 to 2015 by the Investigaciones Mexicanas de la Corriente de California program were used to predict the abundance distribution of V. lucetia larvae under two extreme thermal conditions (the 2000 La Niña and the 2015 El Niño), using the novel machine learning algorithm eXtreme Gradient Boosting (XGBOOST). The data were split into COLD and WARM groups based on the mean sea surface temperature recorded at each station and compared with an undivided TOTAL group. Models were built using 12 environmental and biological predictor features. Root-mean-squared logarithmic error (RMSLE) was used as the prediction performance metric for both internal and external validation. In the internal validation, the COLD model performed best, with a lower RMSLE value, while in the external validation the TOTAL model had the lowest RMSLE values for both the coldest and the warmest conditions. In the external validation, the models accurately predicted the spatial distribution; however, none of them accurately predicted the abundance magnitudes observed under the two extreme thermal conditions. Nevertheless, XGBOOST shows promise for describing the future distribution traits of V. lucetia.
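The RMSLE metric named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulation the authors used, so this assumes the common log(1 + x) convention, which keeps zero abundances well defined. The function name `rmsle` and the station values are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def rmsle(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-squared logarithmic error, assuming the log(1 + x) convention."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one pair of values is required")
    if any(v < 0 for v in observed) or any(v < 0 for v in predicted):
        raise ValueError("RMSLE is defined here for non-negative values only")
    # Squared difference of log-transformed values for each station.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(o)) ** 2
        for o, p in zip(observed, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical larval abundances at four stations (made-up numbers).
observed = [0.0, 12.0, 150.0, 3.0]
predicted = [1.0, 10.0, 90.0, 3.0]
print(f"RMSLE = {rmsle(observed, predicted):.3f}")
```

Because the errors are computed on a logarithmic scale, the metric penalizes relative rather than absolute differences, which suits abundance counts that span several orders of magnitude. A lower value indicates closer agreement between predicted and observed abundances.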
About the journal:
Deep-Sea Research Part II: Topical Studies in Oceanography publishes topical issues from the many international and interdisciplinary projects undertaken in oceanography. Besides these special issues from projects, the journal publishes collections of papers presented at conferences. The special issues regularly have electronic annexes of non-text material (numerical data, images, video, etc.), which are published with the special issues in ScienceDirect. Deep-Sea Research Part II was split off as a separate journal devoted to topical issues in 1993. Its companion journal, Deep-Sea Research Part I: Oceanographic Research Papers, publishes the regular research papers in this area.