Curating, Enriching and FAIRifying Datasets from the Brazilian COVID-19 Vaccination
Marcus Vinicius Ferreira Gonçalves, Jamile Santos, Caio Zava Ferreira, Jorge Juan Zavaleta Gavidia, Sérgio Manuel Serra da Cruz, Jonice Oliveira
J. Inf. Data Manag. Journal Article. Published 2022-08-15.
DOI: 10.5753/jidm.2022.2356 (https://doi.org/10.5753/jidm.2022.2356)
Abstract
As the world struggles with the challenges of vaccination against COVID-19, more attention needs to be paid to the lack of transparency and accessibility of curated vaccination datasets. Among the strategies to combat COVID-19, vaccination and data-centered epidemiological investigations are the most effective. This paper presents the process of building curated and annotated datasets with provenance metadata. The primary dataset is based on the registration data of the Vaccination Campaign against COVID-19 in Brazil and contains thousands of records processed up to March 2021. The data were analyzed, treated, cross-checked, and linked with other sources to correct and complement them, resulting in curated datasets aligned with the FAIR Data principles.