OntoBiodiv: Reconnecting Biodiversity Data with Specimens
D. R. Saleh, Y. Kartika, Zaenal Akbar, A. Krisnadhi, L. Manik
2022 5th International Conference on Networking, Information Systems and Security: Envisage Intelligent Systems in 5g//6G-based Interconnected Digital Worlds (NISS), published 2022-03-30
DOI: 10.1109/NISS55057.2022.10085505 (https://doi.org/10.1109/NISS55057.2022.10085505)
Abstract
Biodiversity data can be produced from preserved specimens, where multiple pieces of information (e.g., taxonomic identification) are extracted from biological samples or materials. A second, observation-based approach collects data digitally without physical samples or materials. The observation-based approach has produced far more data than the specimen-based one. However, with recent technological developments, the tangible samples and materials preserved by the specimen-based approach have become gold mines because they open more opportunities for scientific discovery. For example, a new method for genomic investigation can be applied to specimens collected a decade ago, but only if those specimens have been preserved. It is therefore necessary to shift the focus of biodiversity data collection toward specimens. Unfortunately, most current biodiversity data standards cover specimens only minimally. This work proposes a specimen-centered schema that extends an existing biodiversity data standard, Darwin Core. The extension covers a variety of specimen data properties, including a generalization of the multiple kinds of information that can be extracted from specimens. A comparison of coverage ratio and matching scores shows that the proposed schema outperforms the existing standard: it achieves up to 80% higher coverage and the highest exact-match scores for specimen-based biodiversity data. This work initiates our effort to reconnect biodiversity data to specimens.