Leveraging social media and community science data for environmental niche models: A case study with native Australian bees
Robert A. Moore, Matthew R.E. Symonds, Scarlett R. Howard
Ecological Informatics, Volume 84, Article 102857. Published 2024-10-19.
DOI: 10.1016/j.ecoinf.2024.102857
URL: https://www.sciencedirect.com/science/article/pii/S1574954124003996
Abstract
Museum occurrence records are popular sources of information for creating Environmental Niche Models (ENMs), which allow the potential niche ranges of species to be mapped. Occurrence data are often downloaded en masse from established databases. However, non-traditional data sources, such as occurrence records from community/citizen science outreach and social media, are increasing in use and abundance. Data from these sources are potentially valuable records, particularly for species whose museum occurrence records are comparatively scarce. In the current study, we aimed to determine the impact of adding occurrence data from non-traditional databases to ENMs originally created using traditional databases, using a group of comparatively understudied species: native Australian bees. We used the Maxent algorithm to model the potential environmental niches of eight species. We created three models for each species: 1) one using only location data from museum specimen collection records from the Atlas of Living Australia (ALA) (a traditional database), 2) one combining ALA and geo-tagged social media (Flickr) data, and 3) one combining ALA and geo-tagged community science data from iNaturalist. This resulted in 24 different models. By comparing the models produced from each augmented data set with those from the traditional data set (ALA vs. ALA & Flickr; ALA vs. ALA & iNaturalist), we showed that there were significant differences, not only in predicted ranges but also in the weighting of environmental variables used by the models to predict the environmental niche. Differences were influenced more by the geographic location of the extra occurrences than by the number of additional occurrence points. We demonstrate the potential value and risks of including social media and community science geo-tagged image data in supplementing knowledge of species distributions, particularly for relatively under-sampled species such as native bees.
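The study design described in the abstract (eight species, each modelled with three data configurations) can be illustrated with a minimal Python sketch. This is not the authors' code: the species names are placeholders because the abstract does not list them, the helper names `combine_occurrences` and `model_plan` are hypothetical, and dropping exact duplicate points is an assumption made here for illustration. The Maxent fitting step itself is left out.

```python
from itertools import product

# Placeholder names: the abstract does not name the eight species.
SPECIES = [f"species_{i}" for i in range(1, 9)]

# The three data configurations described in the abstract.
CONFIGURATIONS = {
    "ALA": ("ALA",),
    "ALA+Flickr": ("ALA", "Flickr"),
    "ALA+iNaturalist": ("ALA", "iNaturalist"),
}


def combine_occurrences(records_by_source, sources):
    """Pool (lon, lat) points from the listed sources, keeping first-seen order.

    Dropping exact duplicates is an assumption of this sketch; the abstract
    does not describe how records were filtered.
    """
    seen = set()
    pooled = []
    for source in sources:
        for point in records_by_source.get(source, []):
            if point not in seen:
                seen.add(point)
                pooled.append(point)
    return pooled


def model_plan(species=SPECIES, configurations=CONFIGURATIONS):
    """List every (species, configuration) pair that would need its own model."""
    return [(sp, name) for sp, name in product(species, configurations)]


if __name__ == "__main__":
    plan = model_plan()
    print(len(plan))  # 8 species x 3 configurations = 24 models

    # Made-up coordinates, purely illustrative.
    toy_records = {
        "ALA": [(144.9, -37.8), (151.2, -33.9)],
        "Flickr": [(151.2, -33.9), (153.0, -27.5)],
    }
    print(combine_occurrences(toy_records, CONFIGURATIONS["ALA+Flickr"]))
```

Each pooled point list would then be passed to a Maxent implementation together with the environmental layers. Comparing the ALA-only model against each augmented model for the same species gives the two comparisons named in the abstract.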
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, and the critical need to inform sustainable management in view of global environmental and climate change.
The journal is interdisciplinary, sitting at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.