Correcting environmental sampling bias improves transferability of species distribution models

IF 5.4 · Zone 1 (Environmental Science and Ecology) · JCR Q1 (Biodiversity Conservation)

Ecography Pub Date : 2025-07-10 DOI:10.1002/ecog.08002

Arman Pili, Boris Leroy, Damaris Zurell



Abstract

Sampling bias is an inherent problem in widely available biodiversity data, undermining the robustness of correlative species distribution models (SDMs). To some extent, subsampling occurrence data can account for uneven sampling efforts; yet, conventional approaches subsample in geographical space, while subsampling in environmental space remains underexplored. Here, we compared the effectiveness of subsampling methods that correct sampling bias either in geographical space (spatial gridding, spatial distance thinning) or directly in environmental space (environmental gridding), including two novel approaches introduced here: environmental clustering and environmental distance thinning. We hypothesised that environmental subsampling methods would be more effective in improving SDM performance across its three primary uses: explaining, predicting, and projecting. Using a virtual ecologist framework, we assessed SDM performance against four evaluation tests: replicating true species–environment response curves, predicting within the sampling region via internal cross‐validation and evaluation against independent data, and projecting outside the sampling region. Our findings demonstrate that environmental subsampling methods, especially environmental clustering and environmental distance thinning, outperformed other methods in yielding robust SDMs in almost all evaluation tests. Interestingly, cross‐validation favoured SDMs with no sampling bias correction, highlighting the inability of cross‐validation to identify unbiased models. Our findings emphasise a critical conceptual disconnect: SDMs appearing to perform well in predicting species' distributions may not reliably estimate species–environment relationships, nor transfer predictions onto novel environments. Environmental subsampling methods are reliable approaches for all uses, but are particularly suited for explaining species' niches and transferring predictions across space and/or time, such as when anticipating species' responses to climate change or assessing the risk of biological invasions. Conversely, geographic subsampling methods may suffice for predicting species' distributions within their current environmental context, as required in conservation planning. Our study firmly establishes the critical importance of correcting environmental sampling bias, while also providing reliable solutions for supporting biodiversity conservation in an ever‐changing world.
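
To make the idea of subsampling in environmental space concrete, the following is a minimal Python sketch of two of the approaches named in the abstract: environmental gridding and environmental distance thinning. It is an illustration under stated assumptions, not the authors' implementation. The function names, the `n_bins` and `min_dist` parameters, the z-score standardisation, and the rule of keeping one random record per cell are all hypothetical choices made for this example.

```python
import numpy as np


def environmental_gridding(env, n_bins=10, seed=0):
    """Keep one random occurrence per occupied cell of a regular grid
    laid over environmental space (assumed rule, for illustration).

    env: array of shape (n_records, n_variables) holding the
    environmental values extracted at each occurrence record.
    Returns the sorted indices of the retained records.
    """
    env = np.asarray(env, dtype=float)
    rng = np.random.default_rng(seed)
    lo = env.min(axis=0)
    span = env.max(axis=0) - lo
    span[span == 0] = 1.0  # guard against constant variables
    # Bin index in 0..n_bins-1 along each environmental axis.
    bins = np.minimum((n_bins * (env - lo) / span).astype(int), n_bins - 1)
    # Records with identical bin indices share a grid cell.
    _, cell_id = np.unique(bins, axis=0, return_inverse=True)
    cell_id = cell_id.ravel()
    keep = [rng.choice(np.flatnonzero(cell_id == c)) for c in np.unique(cell_id)]
    return np.sort(np.array(keep))


def environmental_distance_thinning(env, min_dist, seed=0):
    """Greedy thinning: visit records in random order and keep a record
    only if it lies at least min_dist (Euclidean distance on z-scored
    variables) from every record kept so far.
    """
    env = np.asarray(env, dtype=float)
    rng = np.random.default_rng(seed)
    sd = env.std(axis=0)
    sd[sd == 0] = 1.0  # guard against constant variables
    z = (env - env.mean(axis=0)) / sd
    kept = []
    for i in rng.permutation(len(z)):
        if not kept or np.min(np.linalg.norm(z[kept] - z[i], axis=1)) >= min_dist:
            kept.append(i)
    return np.sort(np.array(kept))


if __name__ == "__main__":
    # Hypothetical biased sample: many records from one narrow
    # environmental condition plus a few spread over the full range.
    rng = np.random.default_rng(42)
    env = np.vstack([
        rng.normal(0.0, 0.1, size=(300, 2)),
        rng.uniform(-3.0, 3.0, size=(30, 2)),
    ])
    print(len(env), "records before subsampling")
    print(len(environmental_gridding(env, n_bins=5)), "after gridding")
    print(len(environmental_distance_thinning(env, min_dist=0.5)), "after thinning")
```

Both functions return indices rather than filtered arrays, so the same selection can be applied to the coordinates and any other columns of the occurrence table. The greedy thinning depends on visiting order, so in practice one might repeat it over several seeds and keep the largest retained set; that refinement is also an assumption and is not described in the abstract.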


Source journal: Ecography (Environmental Science, Ecology)

CiteScore: 11.60

Self-citation rate: 3.40%

Articles published: 122

Review time: 8-16 weeks

Journal description: ECOGRAPHY publishes exciting, novel, and important articles that significantly advance understanding of ecological or biodiversity patterns in space or time. Papers focusing on conservation or restoration are welcomed, provided they are anchored in ecological theory and convey a general message that goes beyond a single case study. We encourage papers that seek to advance the field through the development and testing of theory or methodology, or by proposing new tools for analysis or interpretation of ecological phenomena. Manuscripts are expected to address general principles in ecology, though they may do so using a specific model system if they adequately frame the problem relative to a generalized ecological question or problem. Purely descriptive papers are considered only if breaking new ground and/or describing patterns seldom explored. Studies focused on a single species or single location are generally discouraged unless they make a significant contribution to advancing general theory or understanding of biodiversity patterns and processes. Manuscripts merely confirming or marginally extending results of previous work are unlikely to be considered in Ecography. Papers are judged by virtue of their originality, appeal to general interest, and their contribution to new developments in studies of spatial and temporal ecological patterns. There are no biases with regard to taxon, biome, or biogeographical area.