Biodiversity data standards for the organization and dissemination of complex research projects and digital twins: a guide

arXiv - QuanBio - Other Quantitative Biology. Pub Date: 2024-05-30. DOI: arxiv-2405.19857

Carrie Andrew, Sharif Islam, Claus Weiland, Dag Endresen

Citations: 0

Abstract

Biodiversity data are increasing substantially, spurred by technological advances and community (citizen) science initiatives. Integrating data is, likewise, becoming more commonplace. Open science promotes open sharing and use of data. Data standardization is an instrument for the organization and integration of biodiversity data, which is required for complex research projects and digital twins. However, just as with an actual instrument, there is a learning curve to understanding the data standards field. Here we provide a guide, for data providers and data users, on the logistics of compiling and using biodiversity data. We emphasize data standards because they are integral to data integration. We compare three primary avenues for compiling biodiversity data and explain the importance of research infrastructures for coordinated long-term data aggregation. We use the Biodiversity Digital Twin (BioDT) as a case study. Four approaches to data standardization are presented in terms of the balance between practical constraints and the advancement of the data standards field. We aim for this paper to guide and raise awareness of existing issues related to data standardization, and especially of how data standards are key to data interoperability, i.e., machine accessibility. The future is promising for computational biodiversity advancements, such as the BioDT project, but it rests on machine actionability and readability, which in turn require data standards for computational communication.
