Automating the Identification of High-Value Datasets in Open Government Data Portals
Alfonso Quarati, Anastasija Nikiforova
arXiv - CS - Digital Libraries, 2024-06-15
DOI: arxiv-2406.10541 (https://doi.org/arxiv-2406.10541)
Abstract
Recognized for fostering innovation and transparency, driving economic
growth, enhancing public services, supporting research, empowering citizens,
and promoting environmental sustainability, High-Value Datasets (HVD) play a
crucial role in the broader Open Government Data (OGD) movement. However,
identifying HVD presents a resource-intensive and complex challenge due to the
nuanced nature of data value. Our proposal aims to automate the identification
of HVDs on OGD portals using a quantitative approach based on a detailed
analysis of user interest derived from data usage statistics, thereby
minimizing the need for human intervention. The proposed method involves
extracting download data, analyzing metrics to identify high-value categories,
and comparing HVDs across different portals. This automated process
provides valuable insights into trends in dataset usage, reflecting citizens'
needs and preferences. The effectiveness of our approach is demonstrated
through its application to a sample of US OGD city portals. The practical
implications of this study include contributing to the understanding of HVD at
both local and national levels. By providing a systematic and efficient means
of identifying HVD, our approach aims to inform open governance initiatives and
practices, aiding OGD portal managers and public authorities in their efforts
to optimize data dissemination and utilization.
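
The three steps named in the abstract (extracting download data, analyzing metrics to identify high-value categories, and comparing across portals) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the metrics used, so the sketch ranks categories by total downloads and compares portals with a Jaccard overlap, and the portal names, dataset names, categories, and download counts are invented sample values rather than data from the study.

```python
from collections import defaultdict

# Hypothetical records: (portal, dataset, category, downloads).
# All values are invented for illustration, not taken from a real portal.
RECORDS = [
    ("city_a", "bus_routes", "transportation", 900),
    ("city_a", "crime_reports", "public_safety", 700),
    ("city_a", "park_trees", "environment", 100),
    ("city_b", "traffic_counts", "transportation", 400),
    ("city_b", "budget_2023", "finance", 350),
    ("city_b", "air_quality", "environment", 50),
]


def downloads_by_category(records):
    """Sum download counts per category, separately for each portal."""
    totals = defaultdict(lambda: defaultdict(int))
    for portal, _dataset, category, downloads in records:
        totals[portal][category] += downloads
    return totals


def top_categories(category_totals, k):
    """Return the k most-downloaded categories (ties broken by name)."""
    ranked = sorted(category_totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [category for category, _ in ranked[:k]]


def jaccard(a, b):
    """Jaccard overlap of two category lists, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Assumed cut-off: the top 2 categories per portal count as "high-value".
totals = downloads_by_category(RECORDS)
top = {portal: top_categories(cats, k=2) for portal, cats in totals.items()}
overlap = jaccard(top["city_a"], top["city_b"])
```

With this sample, both hypothetical portals rank `transportation` first but differ in their second category, so the overlap is partial. A real analysis would substitute the usage statistics exposed by each portal and whichever metrics and thresholds the full paper defines.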