An Australian Model of Cooperative Data Publishing to OBIS and GBIF

Katherine Tattersall, P. Newman, Sachit Rajbhandari, Dave Watts, Mahmoud Sadeghi
Biodiversity Information Science and Standards. Published 2023-09-07. DOI: 10.3897/biss.7.112228 (https://doi.org/10.3897/biss.7.112228)

Abstract

The Australian Commonwealth Scientific and Industrial Research Organisation (CSIRO) hosts both the Australian Ocean Biodiversity Information System (OBIS) node and the Australian Global Biodiversity Information Facility (GBIF) node within its National Collections and Marine Infrastructure (NCMI) business unit. OBIS-AU is led by the NCMI Information and Data Centre and publishes marine biodiversity data in the Darwin Core (DwC) standard via an Integrated Publishing Toolkit (IPT), with over 450 marine datasets at present. The Australian GBIF node is hosted by a separate team at the Atlas of Living Australia (ALA), a national-scale biodiversity analytical and knowledge delivery portal. The ALA aggregates and publishes over 800 terrestrial and marine datasets from a wide variety of research institutes, museums and collections, governments and citizen science agencies, including OBIS-AU. Many datasets published by OBIS-AU are harvested and republished by ALA, and vice versa.

OBIS-AU identifies, quality-controls and formats marine biodiversity and observation data, then publishes directly to the OBIS international data repository and portal using GBIF IPT technology. The ALA data processing pipeline harvests and aggregates datasets from many sources and enhances them with authoritative taxonomic and spatial reference data before passing the data on to GBIF. OBIS-AU and ALA are working together to rationalise the publication pathways for datasets managed by both groups, where records can be duplicated and metadata harvests can be incomplete, and to ensure that a single collaborative workflow across both units is followed for publication to GBIF. Recently, the data management groups established an agreement to cooperatively publish marine data and eDNA data, and OBIS-AU has commenced publishing datasets directly to GBIF with ALA endorsement.

We present the convergent evolution of OBIS and GBIF data publishing in Australia, the adaptive data workflows that maintain data and metadata integrity, the challenges encountered, the role of domain expertise in ensuring data quality, and the benefits of sharing data skills and code, especially in publishing eDNA data types in DwC (using the DNA-derived data extension) and in exploring the new CamTrap Data Package, which is built on Frictionless Data. We also present the work that both data groups are doing toward adopting GBIF's new Unified Data Model for publishing data. This Australian case study demonstrates the strengths of collaborative data publishing and offers a model that minimises replication of data in global aggregators through the development of regional integrated data publishing pipelines.
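
The duplication risk described above, where the same record reaches an aggregator through two publication pathways, can be illustrated with a minimal sketch. It is not the OBIS-AU or ALA pipeline: it assumes that each record carries the Darwin Core term `occurrenceID` and that this identifier is preserved when a dataset is republished. The function name, dataset labels and sample identifiers are hypothetical.

```python
from collections import defaultdict


def find_duplicate_occurrences(*datasets):
    """Return occurrenceIDs that appear in more than one named dataset.

    Each argument is a (dataset_name, records) pair, where records is an
    iterable of dicts keyed by Darwin Core term names.
    """
    seen = defaultdict(set)
    for name, records in datasets:
        for record in records:
            occurrence_id = record.get("occurrenceID")
            if occurrence_id:  # records without an identifier cannot be matched
                seen[occurrence_id].add(name)
    return {oid: sorted(names) for oid, names in seen.items() if len(names) > 1}


# Hypothetical sample records; the identifiers are made up for illustration.
obis_au_records = [
    {"occurrenceID": "urn:example:occ:1", "scientificName": "Thunnus maccoyii"},
    {"occurrenceID": "urn:example:occ:2", "scientificName": "Carcharodon carcharias"},
]
ala_records = [
    {"occurrenceID": "urn:example:occ:2", "scientificName": "Carcharodon carcharias"},
    {"occurrenceID": "urn:example:occ:3", "scientificName": "Phascolarctos cinereus"},
]

duplicates = find_duplicate_occurrences(
    ("OBIS-AU IPT", obis_au_records),
    ("ALA", ala_records),
)
print(duplicates)
```

Matching on a stable identifier only works if both pathways leave `occurrenceID` unchanged; where identifiers are rewritten during harvesting, a comparison on other fields would be needed instead.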