A Comparison of global and local probabilistic approximations in mining data with many missing attribute values
Authors: Patrick G. Clark, J. Grzymala-Busse
Published in: 2013 IEEE International Conference on Granular Computing (GrC), 2013-12-01
DOI: 10.1109/GrC.2013.6740384 (https://doi.org/10.1109/GrC.2013.6740384)
Citations: 1
Abstract
We present the results of a novel experimental comparison of global and local probabilistic approximations. Global approximations are unions of characteristic sets, while local approximations are constructed from blocks of attribute-value pairs. Two interpretations of missing attribute values are discussed: lost values and “do not care” conditions. Our main objective was to compare global and local probabilistic approximations in terms of the error rate. For our experiments we used six incomplete data sets with many missing attribute values. Global approximations gave the best results for two data sets and local approximations for one data set; for the remaining three data sets the experiments ended in ties. Our next objective was to check the quality of non-standard probabilistic approximations, i.e., probabilistic approximations that were neither lower nor upper approximations. For four data sets the smallest error rate was achieved by non-standard probabilistic approximations; for the remaining two data sets it was achieved by upper approximations. Our final objective was to compare the two interpretations of missing attribute values. For three data sets the best interpretation was the lost value, for one data set it was the “do not care” condition, and for the remaining two data sets there was a tie.
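The abstract does not define its constructions, so the following is only a minimal sketch of how a global probabilistic approximation might be computed as a union of characteristic sets. It assumes the definitions commonly used in the rough-set literature, not ones taken from this abstract: a lost value (written `?` here) keeps a case out of every block of that attribute, a “do not care” condition (written `*`) puts the case into every block of that attribute, and the concept variant of the approximation is used. The toy table, the attribute names, and all function names are hypothetical. Local approximations are not shown.

```python
from typing import Dict, List, Sequence, Set

LOST = "?"        # assumed marker for a lost value
DONT_CARE = "*"   # assumed marker for a "do not care" condition

Table = List[Dict[str, str]]


def blocks(table: Table, attr: str) -> Dict[str, Set[int]]:
    """Map each specified value v of attr to the block of cases matching (attr, v)."""
    values = {row[attr] for row in table} - {LOST, DONT_CARE}
    result: Dict[str, Set[int]] = {v: set() for v in values}
    for case, row in enumerate(table):
        if row[attr] == DONT_CARE:
            # A "do not care" case is placed in every block of this attribute.
            for v in values:
                result[v].add(case)
        elif row[attr] != LOST:
            # A specified value goes into its own block; lost values go nowhere.
            result[row[attr]].add(case)
    return result


def characteristic_set(table: Table, attrs: Sequence[str], case: int) -> Set[int]:
    """Intersect the blocks of the case's specified values; missing values impose no constraint."""
    k = set(range(len(table)))
    for a in attrs:
        v = table[case][a]
        if v not in (LOST, DONT_CARE):
            k &= blocks(table, a)[v]
    return k


def concept_approximation(table: Table, attrs: Sequence[str],
                          concept: Set[int], alpha: float) -> Set[int]:
    """Union of characteristic sets K(x), x in the concept, with Pr(concept | K(x)) >= alpha."""
    approx: Set[int] = set()
    for x in concept:
        k = characteristic_set(table, attrs, x)
        if len(k & concept) / len(k) >= alpha:
            approx |= k
    return approx


# Hypothetical toy data: five cases, two attributes, two missing values.
toy_table: Table = [
    {"Temperature": "high", "Headache": "yes"},
    {"Temperature": "?", "Headache": "yes"},
    {"Temperature": "high", "Headache": "*"},
    {"Temperature": "normal", "Headache": "no"},
    {"Temperature": "normal", "Headache": "no"},
]
toy_attrs = ["Temperature", "Headache"]
toy_concept = {0, 1, 3}

lower = concept_approximation(toy_table, toy_attrs, toy_concept, 1.0)
middle = concept_approximation(toy_table, toy_attrs, toy_concept, 0.6)
upper = concept_approximation(toy_table, toy_attrs, toy_concept, 0.01)
```

Under these assumptions, `alpha = 1.0` corresponds to the lower approximation and a small positive `alpha` to the upper approximation. On the toy data, `alpha = 0.6` yields a set that differs from both, which is the kind of non-standard probabilistic approximation the abstract refers to.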