The Data Repurposing Challenge
Philip Woodall
Journal of Data and Information Quality (JDIQ), pp. 1-4. Published 2017-06-30.
DOI: https://doi.org/10.1145/3022698
Citations: 5
Abstract
When data is collected for the first time, the data collector has in mind the data quality requirements that must be satisfied before it can be used successfully—that is, the data collector ensures “fitness for use”—the commonly agreed upon definition of data quality [Wang and Strong 1996]. However, data that is repurposed [Woodall and Wainman 2015], as opposed to reused, must be managed with multiple different fitness for use requirements in mind, which complicates any data quality enhancements [Ballou and Pazer 1985]. While other work has considered context in relation to data quality requirements, including the need to meet multiple fitness for use requirements [Watts et al. 2009; Bertossi et al. 2011], in the current fast-paced environment of data repurposing for analytics and business intelligence, there are new challenges for dealing with multiple fitness for use requirements in the context of: