Correcting environmental sampling bias improves transferability of species distribution models
Authors: Arman Pili, Boris Leroy, Damaris Zurell
Journal: Ecography (impact factor 5.4; JCR Q1, Biodiversity Conservation)
Published: 2025-07-10
DOI: 10.1002/ecog.08002 (https://doi.org/10.1002/ecog.08002)
Citations: 0
Abstract
Sampling bias is an inherent problem in widely available biodiversity data, undermining the robustness of correlative species distribution models (SDMs). To some extent, subsampling occurrence data can account for uneven sampling effort; yet conventional approaches subsample in geographical space, while subsampling in environmental space remains underexplored. Here, we compared the effectiveness of subsampling methods that correct sampling bias either in geographical space (spatial gridding, spatial distance thinning) or directly in environmental space (environmental gridding and two novel approaches introduced here: environmental clustering and environmental distance thinning). We hypothesised that environmental subsampling methods would be more effective in improving SDM performance across the three primary uses of SDMs: explaining, predicting, and projecting. Using a virtual ecologist framework, we assessed SDM performance against four evaluation tests: replicating true species–environment response curves, predicting within the sampling region as assessed by internal cross‐validation, predicting within the sampling region as assessed against independent data, and projecting outside the sampling region. Our findings demonstrate that environmental subsampling methods, especially environmental clustering and environmental distance thinning, outperformed other methods in yielding robust SDMs in almost all evaluation tests. Interestingly, cross‐validation favoured SDMs with no sampling bias correction, highlighting the inability of cross‐validation to identify unbiased models. Our findings emphasise a critical conceptual disconnect: SDMs that appear to perform well in predicting species' distributions may not reliably estimate species–environment relationships, nor transfer predictions onto novel environments. 
Environmental subsampling methods are reliable approaches for all uses, but are particularly suited for explaining species' niches and transferring predictions across space and/or time, such as when anticipating species' responses to climate change or assessing the risk of biological invasions. Conversely, geographic subsampling methods may suffice for predicting species' distributions within their current environmental context, as required in conservation planning. Our study firmly establishes the critical importance of correcting environmental sampling bias, while also providing reliable solutions for supporting biodiversity conservation in an ever‐changing world.
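To make the idea of subsampling in environmental space concrete, the following is a minimal sketch of environmental gridding, assuming the common formulation in which environmental space is divided into regular cells and at most one occurrence record is retained per cell. It is not the authors' implementation: the function name `environmental_grid_subsample`, the `n_bins` parameter, and the simulated temperature and precipitation values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def environmental_grid_subsample(env, n_bins=10, seed=0):
    """Keep at most one occurrence per cell of a regular grid in environmental space.

    env: (n_records, n_variables) array of environmental values at occurrence points.
    Returns the sorted indices of the retained records.
    (Hypothetical helper for illustration, not the method from the paper.)
    """
    env = np.asarray(env, dtype=float)
    rng = np.random.default_rng(seed)
    lo = env.min(axis=0)
    span = env.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant variables
    # Assign each record a bin index (0 .. n_bins - 1) along every variable.
    cells = np.floor((env - lo) / span * n_bins).astype(int)
    cells = np.clip(cells, 0, n_bins - 1)
    # Visit records in random order and keep the first one seen in each cell.
    kept = {}
    for i in rng.permutation(len(env)):
        kept.setdefault(tuple(cells[i]), i)
    return np.sort(np.fromiter(kept.values(), dtype=int))


# Simulated bias: 900 records crowded into a narrow environmental range,
# plus 100 records spread across the full range (made-up values).
rng = np.random.default_rng(42)
dense = rng.normal(loc=[15.0, 800.0], scale=[0.5, 20.0], size=(900, 2))
sparse = rng.uniform(low=[0.0, 200.0], high=[30.0, 2000.0], size=(100, 2))
env = np.vstack([dense, sparse])

idx = environmental_grid_subsample(env, n_bins=10)
print(len(env), "records ->", len(idx), "after environmental gridding")
```

Because the heavily sampled conditions fall into only a few cells, most of the crowded records are dropped while the sparsely sampled conditions are largely retained. The grid resolution (`n_bins` here) controls how aggressively records are thinned and would need tuning in any real analysis.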
About the journal:
ECOGRAPHY publishes exciting, novel, and important articles that significantly advance understanding of ecological or biodiversity patterns in space or time. Papers focusing on conservation or restoration are welcomed, provided they are anchored in ecological theory and convey a general message that goes beyond a single case study. We encourage papers that seek to advance the field through the development and testing of theory or methodology, or by proposing new tools for analysis or interpretation of ecological phenomena. Manuscripts are expected to address general principles in ecology, though they may do so using a specific model system if they adequately frame the problem relative to a generalized ecological question or problem.
Purely descriptive papers are considered only if breaking new ground and/or describing patterns seldom explored. Studies focused on a single species or single location are generally discouraged unless they make a significant contribution to advancing general theory or understanding of biodiversity patterns and processes. Manuscripts merely confirming or marginally extending results of previous work are unlikely to be considered in Ecography.
Papers are judged by virtue of their originality, appeal to general interest, and their contribution to new developments in studies of spatial and temporal ecological patterns. There are no biases with regard to taxon, biome, or biogeographical area.