Filling Gaps in Earthworm Digital Diversity in Northern Eurasia from Russian-language Literature

Maxim Shashkov, Natalya Ivanova, Sergey Ermolov
Biodiversity Information Science and Standards, published 2023-09-20
DOI: 10.3897/biss.7.112957 (https://doi.org/10.3897/biss.7.112957)

Abstract

Data availability for certain groups of organisms (ecosystem engineers, invasive or protected species, etc.) is important for monitoring and making predictions in changing environments. One of the most promising approaches to studying the impact of such changes is species distribution modelling. Such models depend heavily on high-quality occurrence data (Van Eupen et al. 2021). Earthworms (order Crassiclitellata) are a key group of organisms (Lavelle 2014), but their global distribution is underrepresented in digital resources. Dozens of earthworm species, both widespread and endemic, inhabit Northern Eurasia (Perel 1979), but very little data on them are available through global biodiversity repositories (Cameron 2018). There are two main obstacles to data mobilisation. First, studies of earthworm diversity in Northern Eurasia have a long history (since the end of the nineteenth century) and were conducted by several generations of Soviet and Russian researchers. Most of the collected data were published in "grey literature" that is now held in only a few libraries. Until recently, most of these publications remained undigitised, and some are probably irretrievably lost. Second, the taxonomic checklists used by Soviet and European researchers differ, and not all species and synonyms are included in the GBIF (Global Biodiversity Information Facility) Backbone Taxonomy. As a result, existing earthworm species distribution models (Phillips 2019) potentially miss a significant amount of data and may underestimate biodiversity and predict distributions inaccurately. To fill this gap, we collected occurrence data from the Russian-language literature (published by Soviet and Russian researchers) and digitised species checklists, keeping the original scientific names.
To find relevant literature, we ran a keyword search for "earthworms" and "Lumbricidae" in the Russian national scientific online library eLibrary and screened the reference lists of the monographs by the leading Soviet and Russian soil zoologist Tamara Perel (Vsevolodova-Perel 1997, Perel 1979). This yielded about 1,000 references, of which 330 papers had titles suggesting that they might contain data on earthworm occurrences. Of these, 219 were found as PDF files or printed papers. For dataset compilation, 159 papers were used; the others lacked exact location data or duplicated data found in other papers. Most of the sources were peer-reviewed articles (Table 1). A reference list is available through Zenodo (Ivanova et al. 2023). The earliest publication we could find, by Wilhelm Michaelsen, dates back to 1899; the most recent dates from 2023. About a third of the sources were written by the systematists Iosif Malevich and Tamara Perel. Occurrence data were extracted and structured according to the Darwin Core standard (Wieczorek et al. 2012). During digitisation, we tried to include as much primary information as possible. Only one tenth of the literature occurrences came with geographic coordinates provided by the authors. The remaining occurrences were manually georeferenced using the point-radius method (Wieczorek et al. 2010). The resulting occurrence dataset, "Earthworm occurrences from Russian-language literature" (Shashkov et al. 2023), was published through the GBIF portal. It contains 5,304 occurrences of 117 species from 27 countries (Fig. 1). To improve the GBIF Backbone Taxonomy, we digitised two catalogues of earthworm species compiled by Tamara Perel, one for the USSR (Perel 1979) and one for the Russian Federation (Vsevolodova-Perel 1997).
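As an illustration of the structuring step described above, the following minimal sketch shows how one literature occurrence georeferenced with the point-radius method could be expressed with Darwin Core terms. It is not the authors' workflow: the choice of terms, the helper names make_occurrence and to_csv, and every value in the usage note are assumptions made for illustration only.

```python
import csv
import io

# A small, assumed selection of Darwin Core terms; the published dataset
# may use a different or larger set.
DWC_FIELDS = [
    "occurrenceID", "basisOfRecord", "scientificName", "verbatimLocality",
    "decimalLatitude", "decimalLongitude", "geodeticDatum",
    "coordinateUncertaintyInMeters", "georeferenceProtocol",
    "bibliographicCitation",
]


def make_occurrence(occurrence_id, name, locality, lat, lon, radius_m, citation):
    """Build one record: a point (lat, lon) plus an uncertainty radius in metres."""
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    if radius_m <= 0:
        raise ValueError("the uncertainty radius must be positive")
    return {
        "occurrenceID": occurrence_id,
        "basisOfRecord": "MaterialCitation",  # assumed value for literature records
        "scientificName": name,               # kept as spelled in the source paper
        "verbatimLocality": locality,
        "decimalLatitude": lat,
        "decimalLongitude": lon,
        "geodeticDatum": "WGS84",
        "coordinateUncertaintyInMeters": radius_m,
        "georeferenceProtocol": "point-radius method",
        "bibliographicCitation": citation,
    }


def to_csv(records):
    """Serialise records to CSV text with Darwin Core terms as column headers."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=DWC_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

A call such as make_occurrence("example-0001", "Lumbricidae", "hypothetical village, 3 km N", 55.0, 37.0, 3000, "hypothetical source") returns one record, and to_csv([record]) turns it into a header row plus one data row. All of these values are placeholders, not data from the dataset.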
Based on these monographs, three checklist datasets were published through GBIF (Shashkov 2023b, 124 records; Shashkov 2023c, 87 records; Shashkov 2023a, 95 records). We are now working towards including these names in the GBIF Backbone Taxonomy so that all species names can be matched and recorded exactly as they appear in papers published by Soviet and Russian researchers.
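One way to see which original names still lack a Backbone match is to query the public GBIF species match service and inspect the matchType field of each response. The sketch below assumes that approach; it is not the procedure the authors describe, and the helper names, the kingdom filter and the review rule (treating NONE and HIGHERRANK as unmatched) are illustrative assumptions.

```python
import json
import urllib.parse
import urllib.request

GBIF_MATCH_URL = "https://api.gbif.org/v1/species/match"


def build_match_url(name, kingdom="Animalia"):
    """Return the species-match request URL for one verbatim scientific name."""
    query = urllib.parse.urlencode({"name": name, "kingdom": kingdom})
    return f"{GBIF_MATCH_URL}?{query}"


def fetch_match(name):
    """Query the GBIF API (needs network access) and return the parsed JSON."""
    with urllib.request.urlopen(build_match_url(name), timeout=30) as resp:
        return json.load(resp)


def needs_backbone_review(match):
    """Assumed rule: no match, or only a higher-rank match, means the name is missing."""
    return match.get("matchType") in {"NONE", "HIGHERRANK"}


def unmatched_names(names, lookup=fetch_match):
    """Return the names whose lookup result suggests they are absent from the Backbone."""
    return [name for name in names if needs_backbone_review(lookup(name))]
```

Passing a different lookup function, for example one that reads cached responses from disk, lets the same check run offline or be repeated after the Backbone has been updated.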