{"title":"Government Big Data Ecosystem: Definitions, Types of Data, Actors, and Roles and the Impact in Public Administrations","authors":"Syed Iftikhar Husain Shah, Vassilios Peristeras, Ioannis Magnisalis","doi":"10.1145/3425709","DOIUrl":"https://doi.org/10.1145/3425709","url":null,"abstract":"The public sector, private firms, business community, and civil society are generating data that are high in volume, veracity, and velocity and come from a diversity of sources. This type of data is today known as big data. Public administrations pursue big data as “new oil” and implement data-centric policies to collect, generate, process, share, exploit, and protect data for promoting good governance, transparency, innovative digital services, and citizens’ engagement in public policy. All of the above constitute the Government Big Data Ecosystem (GBDE). Despite the great interest in this ecosystem, there is a lack of clear definitions, the various important types of government data remain vague, the different actors and their roles are not well defined, while the impact in key public administration sectors is not yet deeply understood and assessed. Such research and literature gaps impose a crucial obstacle for a better understanding of the prospects and nascent issues in exploiting GBDE. With this study, we aim to start filling the above-mentioned gaps by organizing our findings from an extended Systematic Literature Review into a framework to organise and address the above-mentioned challenges. Our goal is to contribute in this fast-evolving area by bringing some clarity and establishing common understanding around key elements of the emerging GBDE.","PeriodicalId":44355,"journal":{"name":"ACM Journal of Data and Information Quality","volume":"17 1","pages":"1 - 25"},"PeriodicalIF":2.1,"publicationDate":"2021-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86720766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Subjectivity in the Creation of Machine Learning Models","authors":"Mary L. Cummings, Songpo Li","doi":"10.1145/3418034","DOIUrl":"https://doi.org/10.1145/3418034","url":null,"abstract":"Transportation analysts are inundated with requests to apply popular machine learning modeling techniques to datasets to uncover never-before-seen relationships that could potentially revolutionize...","PeriodicalId":44355,"journal":{"name":"ACM Journal of Data and Information Quality","volume":"13 1","pages":"1-19"},"PeriodicalIF":2.1,"publicationDate":"2021-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3418034","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64033851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward a Complete Data Valuation Process. Challenges of Personal Data","authors":"Mihnea Tufis, Ludovico Boratto","doi":"10.1145/3447269","DOIUrl":"https://doi.org/10.1145/3447269","url":null,"abstract":"","PeriodicalId":44355,"journal":{"name":"ACM Journal of Data and Information Quality","volume":"13 1","pages":"20:1-20:7"},"PeriodicalIF":2.1,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64037731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Metadata Harvesting and Quality Assurance within Open Urban Platforms","authors":"Philipp Lämmel, Benjamin Dittwald, Lina Bruns, Nikolay Tcholtchev, Yuri Glikman, S. Cuno, Mathias Flügge, I. Schieferdecker","doi":"10.1145/3409795","DOIUrl":"https://doi.org/10.1145/3409795","url":null,"abstract":"During the past years, various activities and concepts have shaped and prepared the path for the development of urban environments toward smart cities across the world. One of the initial activities was relating to the opening of vast amounts of data from various public administrations and utility companies within a city in order to create a viable eco-system of urban services and applications. Thereby, the harvested metadata needed to be verified in terms of correctness and a corresponding level of quality had to be assured. In addition, the concept of an Open Urban Platform emerged as an overall solution for smart cities Information Communication Technology (ICT) in the sense that an abstract reference model was established and standardized, providing an overall picture of the ICT structures within a city. Within this article, we use the Open Urban Platform concept as the basics to describe and map our activities within the Open Data domain, focusing mainly on the Open Data prototype for German Open Governmental Data—namely GovData.DE. Thereby, we describe our metadata harvesting and metadata quality assurance approach and discuss on lessons learned, which flow into the definition of metadata quality metrics and have the potential to lead to a corresponding standard within the Deutsches Institut für Normung e.V. (DIN) German national standardization.","PeriodicalId":44355,"journal":{"name":"ACM Journal of Data and Information Quality","volume":"12 1","pages":"1 - 20"},"PeriodicalIF":2.1,"publicationDate":"2020-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3409795","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64031563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Profiling in Property Graph Databases","authors":"Sofía Maiolo, Lorena Etcheverry, Adriana Marotta","doi":"10.1145/3409473","DOIUrl":"https://doi.org/10.1145/3409473","url":null,"abstract":"Property Graph databases are being increasingly used within the industry as a powerful and flexible way to model real-world scenarios. With this flexibility, a great challenge appears regarding pro...","PeriodicalId":44355,"journal":{"name":"ACM Journal of Data and Information Quality","volume":"12 1","pages":"1-27"},"PeriodicalIF":2.1,"publicationDate":"2020-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3409473","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64031470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transforming Pairwise Duplicates to Entity Clusters for High-quality Duplicate Detection","authors":"Uwe Draisbach, Peter Christen, Felix Naumann","doi":"10.1145/3352591","DOIUrl":"https://doi.org/10.1145/3352591","url":null,"abstract":"Duplicate detection algorithms produce clusters of database records, each cluster representing a single real-world entity. As most of these algorithms use pairwise comparisons, the resulting (trans...","PeriodicalId":44355,"journal":{"name":"ACM Journal of Data and Information Quality","volume":"12 1","pages":"1-30"},"PeriodicalIF":2.1,"publicationDate":"2020-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3352591","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48569568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Completing and Debugging Ontologies: State of the Art and Challenges in Repairing Ontologies","authors":"P. Lambrix","doi":"10.1145/3597304","DOIUrl":"https://doi.org/10.1145/3597304","url":null,"abstract":"As semantically-enabled applications require high-quality ontologies, developing and maintaining ontologies that are as correct and complete as possible is an important although difficult task in ontology engineering. A key task is ontology debugging and completion. In general, there are two steps: detecting defects and repairing defects. In this paper we discuss the state of the art regarding the repairing step. We do this by formalizing the repairing step as an abductive reasoning problem and situating the state of the art with respect to this framework. We show that there are still many open research problems and show opportunities for further work and advancing the field.","PeriodicalId":44355,"journal":{"name":"ACM Journal of Data and Information Quality","volume":"14 1","pages":""},"PeriodicalIF":2.1,"publicationDate":"2019-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85508766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Call for Papers Special Issue on Entity Resolution","authors":"J. Talburt, S. Madnick, Yang W. Lee","doi":"10.1145/1805286.1805292","DOIUrl":"https://doi.org/10.1145/1805286.1805292","url":null,"abstract":"Entity resolution (ER) is a key process for improving data quality in data integration in modern information systems. ER covers a wide range of approaches to entity-based integration, known variously as merge/purge, record de-duplication, heterogeneous join, identity resolution, and customer recognition. More broadly, ER also includes a number of important pre- and post-integration activities, such as entity reference extraction and entity relationship analysis. Based on direct record matching strategies, such as those described by the Fellegi-Sunter Model, new theoretical frameworks are evolving to describe ER processes and outcomes that include other types of inferred and asserted reference linking techniques. Businesses have long recognized that the quality of their ER processes directly impacts the overall value of their information assets and the quality of the information products they produce. Government agencies and departments, including law enforcement and the intelligence community, are increasing their use of ER as a tool for accomplishing their missions as well. Recognizing the growing interest in ER theory and practice, and its impact on information quality in organizations, the ACM Journal of Data and Information Quality (JDIQ) will devote a special issue to innovative and high-quality research papers in this area. Papers that address any aspect of entity resolution are welcome.","PeriodicalId":44355,"journal":{"name":"ACM Journal of Data and Information Quality","volume":"2 1","pages":"6"},"PeriodicalIF":2.1,"publicationDate":"2010-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1805286.1805292","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64113805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}