The Effect of Missing Data on Classification Quality
Authors: Michael Feldman, A. Even, Y. Parmet
Venue: MIT International Conference on Information Quality
Published: 2012-11-17
DOI: 10.5167/UZH-93692 (https://doi.org/10.5167/UZH-93692)
Abstract
The field of data quality management has long recognized the negative impact of data quality defects on decision quality. In many decision scenarios, this negative impact can be largely attributed to the mediating role played by decision-support models: with defective data, the estimation of such a model becomes less reliable and, as a result, the likelihood of flawed decisions increases. Drawing on that argument, this study presents a methodology for assessing the impact of quality defects on the likelihood of flawed decisions. The methodology is first presented at a high level and then extended to analyze the impact of missing values on binary Linear Discriminant Analysis (LDA) classifiers. To conclude, we discuss possible extensions and directions for future research.
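The mechanism described in the abstract (missing values degrade model estimation, which in turn can raise the rate of wrong classifications) can be explored with a small simulation. The sketch below is not the authors' methodology. It is an illustrative setup under assumptions the abstract does not state: two Gaussian classes with two features and unit variance, values missing completely at random, complete-case deletion as the handling strategy, and equal class priors. All function names and parameters are hypothetical.

```python
import random


def sample(rng, center, n):
    """Draw n two-feature points around `center` with unit variance (assumed)."""
    return [[rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)]
            for _ in range(n)]


def drop_incomplete(rng, rows, p_missing):
    """Blank each value independently with probability p_missing (an MCAR
    assumption), then keep only rows with no missing value."""
    kept = []
    for r in rows:
        masked = [None if rng.random() < p_missing else v for v in r]
        if None not in masked:
            kept.append(masked)
    return kept


def fit_lda(X0, X1):
    """Fit a binary LDA rule with equal priors. Returns (w, c); a point x
    is assigned to class 1 when w . x > c."""
    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(2)]

    m0, m1 = mean(X0), mean(X1)
    # Pooled within-class covariance matrix (2 x 2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((X0, m0), (X1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    dof = len(X0) + len(X1) - 2
    s = [[v / dof for v in row] for row in s]
    # Closed-form inverse of the 2 x 2 covariance matrix.
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    c = w[0] * mid[0] + w[1] * mid[1]
    return w, c


def error_rate(w, c, X0, X1):
    """Share of test points that the rule (w, c) assigns to the wrong class."""
    wrong = sum(1 for x in X0 if w[0] * x[0] + w[1] * x[1] > c)
    wrong += sum(1 for x in X1 if w[0] * x[0] + w[1] * x[1] <= c)
    return wrong / (len(X0) + len(X1))


def simulate(p_missing, n_train=40, n_test=2000, seed=0):
    """Train on data with missing values, test on complete data.
    Returns (number of training rows actually used, test error rate)."""
    rng = random.Random(seed)
    train0 = drop_incomplete(rng, sample(rng, (0.0, 0.0), n_train), p_missing)
    train1 = drop_incomplete(rng, sample(rng, (2.0, 2.0), n_train), p_missing)
    test0 = sample(rng, (0.0, 0.0), n_test)
    test1 = sample(rng, (2.0, 2.0), n_test)
    w, c = fit_lda(train0, train1)
    return len(train0) + len(train1), error_rate(w, c, test0, test1)


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.3):
        n_used, err = simulate(p)
        print(f"p_missing={p:.1f}  training rows used={n_used}  test error={err:.3f}")
```

A single run with one seed says little. Averaging `simulate` over many seeds for each missing rate would give a more stable picture of how the error rate responds to missingness. Other handling strategies, such as imputation, may behave differently from the complete-case deletion assumed here.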