Ensuring Adherence to Standards in Experiment-Related Metadata Entered Via Spreadsheets
Martin J. O'Connor, Josef Hardi, Marcos Martínez-Romero, Sowmya Somasundaram, Brendan Honick, Stephen A. Fisher, Ajay Pillai, Mark A. Musen
arXiv - CS - Digital Libraries, 2024-09-13
https://doi.org/arxiv-2409.08897
Abstract
Scientists increasingly recognize the importance of providing rich,
standards-adherent metadata to describe their experimental results. Although
sophisticated tools are available to assist in the process of data
annotation, investigators generally seem to prefer spreadsheets when
supplying metadata, despite the limitations of spreadsheets in ensuring
metadata consistency and compliance with formal specifications. In this paper,
we describe an end-to-end approach that supports spreadsheet-based entry of
metadata, while ensuring rigorous adherence to community-based metadata
standards and providing quality control. Our methods employ several key
components, including customizable templates that capture metadata standards
and that can inform the spreadsheets that investigators use to author metadata,
controlled terminologies and ontologies for defining metadata values that can
be accessed directly from a spreadsheet, and an interactive Web-based tool that
allows users to rapidly identify and fix errors in their spreadsheet-based
metadata. We demonstrate how this approach is being deployed in a biomedical
consortium known as HuBMAP to define and collect metadata about a wide range of
biological assays.
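
To make the idea of template-driven checking concrete, the following is a minimal sketch of how spreadsheet rows might be validated against a template that lists required fields, permitted terms, and value types. It is an illustration under stated assumptions, not the system described in this paper: the `TEMPLATE` structure, the field names, the permitted terms, and the `validate_rows` function are all hypothetical, and the sketch reads CSV text rather than a real spreadsheet file.

```python
import csv
import io

# Hypothetical template: field name -> constraints.
# This is an assumption for illustration, not an actual HuBMAP schema.
TEMPLATE = {
    "sample_id": {"required": True},
    "assay_type": {"required": True, "allowed": {"RNA-seq", "ATAC-seq"}},
    "read_length": {"required": False, "type": int},
}


def validate_rows(csv_text, template):
    """Return a list of (row_number, field, message) for each violation found."""
    errors = []
    reader = csv.DictReader(io.StringIO(csv_text))

    # Report template fields that have no column in the header row.
    missing = [f for f in template if f not in (reader.fieldnames or [])]
    for field in missing:
        errors.append((1, field, "column missing from spreadsheet"))

    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for field, spec in template.items():
            if field in missing:
                continue
            value = (row.get(field) or "").strip()
            if not value:
                # Empty cells are only an error for required fields.
                if spec.get("required"):
                    errors.append((row_number, field, "required value is empty"))
                continue
            allowed = spec.get("allowed")
            if allowed is not None and value not in allowed:
                errors.append((row_number, field, f"'{value}' is not a permitted term"))
            if spec.get("type") is int:
                try:
                    int(value)
                except ValueError:
                    errors.append((row_number, field, f"'{value}' is not an integer"))
    return errors


# Example sheet: the second data row has an empty ID, a non-standard term,
# and a non-numeric read length.
SHEET = "sample_id,assay_type,read_length\nS1,RNA-seq,100\n,rna seq,abc\n"
for error in validate_rows(SHEET, TEMPLATE):
    print(error)
```

Each reported error carries the row number and field name, which is the kind of information an interactive repair tool would need in order to point a user at the offending cell.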