Building Software for Hierarchical Events in Biodiversity Informatics

Biodiversity Information Science and Standards. Pub Date: 2023-08-29. DOI: 10.3897/biss.7.111770

P. Newman, David Martin, J. Molina


Abstract

In 2019, the Atlas of Living Australia (ALA) ran a national consultation, confirming a long-held suspicion: while simple occurrence records make biodiversity data highly discoverable and analysable, the lack of contextual information on data collection methodology and protocols limits their usefulness for species abundance estimation and time-series analysis. The consultation recognised that the ALA has strong leadership in biodiversity standards and development, and that our 12-year history and investment in projects and engagement demonstrate a clear capacity to transition to a repository capable of capturing and aggregating the monitoring and survey data required for conservation efforts (Daly 2019).

Around the same time, the larger data landscape was changing in a similar direction, both internationally through the Global Biodiversity Information Facility’s (GBIF) Unified Model engagements, and nationally through the development of the Australian Biodiversity Information Standard (ABIS), an ontology for describing environmental data (Anonymous 2021). We embarked on a project to examine existing data standards and practices, extend our own occurrence model, and build software that could ingest event-based datasets and make them discoverable and interoperable.

Initially we focused on well-structured surveys, both marine and terrestrial, to develop the system and user interface (UI). During the project, we restructured and modelled other exemplar datasets, collaborating with GBIF to develop event terms, vocabularies, and user interface components. Seeking interoperability with existing standards, we integrated concepts from both ABIS and the Ocean Biodiversity Information System’s (OBIS) ENV-DATA model (De Pooter et al. 2017) into a standardised yet flexible implementation of Event Core, navigable via a friendly user interface.
The initial software release comprises an ingestion pipeline for events that runs in parallel to the one for occurrences, an index capable of handling nested data structures, and a user interface. The UI guides the user to explore and filter datasets; includes visualisations for data structures, taxonomic scope, repeat location surveys, and extended measurements or facts; and links out to child occurrence records. Users can download filtered original and interpreted datasets with Digital Object Identifiers (DOI), in compressed files that comply simultaneously with the Darwin Core Archive and Frictionless Data Package specifications.

On release, we will present a range of datasets covering different event-based scenarios. The model has serendipitously provided the flexibility to encapsulate complex seed bank data. During the project, we developed a draft extension, which we used to service a new data portal for the Australian Seed Bank Partnership, a testament to the model’s serviceability for novel use cases.

The ALA has taken innovative steps beyond simple collection of complex data types and worked with our local biodiversity informatics community to provide a navigable interface to this data. We intend to continue working with our own data providers and the international community to realise the benefits of a more complex data model.
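As an illustration of what "nested data structures" and "child occurrence records" can mean in an Event Core dataset, the following is a minimal sketch in Python and not the ALA implementation. It assumes flat event rows that use the Darwin Core terms eventID and parentEventID, and occurrence rows that point back to an event through eventID. The sample values and the helper names build_event_tree and count_occurrences are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows. eventID, parentEventID, occurrenceID and
# scientificName are Darwin Core terms; the values are invented.
EVENTS = [
    {"eventID": "survey-1", "parentEventID": None},
    {"eventID": "site-A", "parentEventID": "survey-1"},
    {"eventID": "sample-A1", "parentEventID": "site-A"},
    {"eventID": "site-B", "parentEventID": "survey-1"},
]
OCCURRENCES = [
    {"occurrenceID": "occ-1", "eventID": "sample-A1", "scientificName": "Acacia dealbata"},
    {"occurrenceID": "occ-2", "eventID": "site-B", "scientificName": "Eucalyptus regnans"},
]


def build_event_tree(events, occurrences):
    """Nest flat event rows under their parents and attach child occurrences.

    Events whose parentEventID is empty, or refers to an event that is not
    in the dataset, are returned as roots.
    """
    occ_by_event = defaultdict(list)
    for occ in occurrences:
        occ_by_event[occ["eventID"]].append(occ)

    # One node per event, carrying its own children and occurrences.
    nodes = {
        e["eventID"]: {**e, "children": [], "occurrences": occ_by_event[e["eventID"]]}
        for e in events
    }

    roots = []
    for node in nodes.values():
        parent = nodes.get(node["parentEventID"])
        if parent is None:
            roots.append(node)
        else:
            parent["children"].append(node)
    return roots


def count_occurrences(node):
    """Count occurrences attached to an event and all of its descendants."""
    return len(node["occurrences"]) + sum(
        count_occurrences(child) for child in node["children"]
    )


if __name__ == "__main__":
    for root in build_event_tree(EVENTS, OCCURRENCES):
        print(root["eventID"], count_occurrences(root))
```

A nested document of this shape is one plausible input for an index that supports hierarchical filtering; the abstract does not describe the actual index schema, so the structure here is only an assumption.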
