Data Quality: The Role of Empiricism
S. Sadiq, T. Dasu, X. Dong, J. Freire, I. Ilyas, S. Link, Renée J. Miller, Felix Naumann, Xiaofang Zhou, D. Srivastava
SIGMOD Rec., pp. 35-43. Journal article, published 2018-02-22. DOI: 10.1145/3186549.3186559 (https://doi.org/10.1145/3186549.3186559)
Citations: 37
Abstract
We outline a call to action for promoting empiricism in data quality research. The action points result from an analysis of the landscape of data quality research. The landscape exhibits two dimensions of empiricism in data quality research relating to type of metrics and scope of method. Our study indicates the presence of a data continuum ranging from real to synthetic data, which has implications for how data quality methods are evaluated. The dimensions of empiricism and their inter-relationships provide a means of positioning data quality research, and help expose limitations, gaps and opportunities.