Spatial Contextualization for Closed Itemset Mining
Altobelli B. Mantuan, L. Fernandes
2018 IEEE International Conference on Data Mining (ICDM), published 2018-11-01
DOI: 10.1109/ICDM.2018.00155 (https://doi.org/10.1109/ICDM.2018.00155)
Citation count: 1
Abstract
We present the Spatial Contextualization for Closed Itemset Mining (SCIM) algorithm, an approach that builds a space for the target database so that relevant itemsets can be retrieved according to the relative spatial location of their items. Our algorithm uses Dual Scaling to map the items of the database to a multidimensional space called the Solution Space. Representing the database in the Solution Space assists in interpreting and defining overlapping clusters of related items. Therefore, instead of a minimum support threshold, a distance threshold is defined with respect to the reference and maximum distances computed per cluster during the mapping procedure. Closed itemsets are efficiently retrieved by a new procedure that uses an FP-Tree, a CFI-Tree, and the proposed spatial contextualization. Experiments show that the mean all-confidence of the itemsets retrieved by our technique outperforms that obtained by state-of-the-art algorithms. Additionally, we use the Minimum Description Length (MDL) metric to verify how descriptive the collections of mined patterns are.
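As a point of reference for two terms used in the abstract, the sketch below computes the all-confidence of an itemset and checks whether an itemset is closed, using their standard textbook definitions. It is not the SCIM procedure: there is no Dual Scaling, FP-Tree, or CFI-Tree here, and the function names and the toy transaction database are illustrative assumptions that do not come from the paper.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def all_confidence(itemset, transactions):
    """Support of a non-empty itemset divided by the largest
    single-item support among its items."""
    max_item_support = max(support({i}, transactions) for i in itemset)
    if max_item_support == 0:
        return 0.0
    return support(itemset, transactions) / max_item_support


def is_closed(itemset, transactions):
    """True if no proper superset of `itemset` has the same support.

    Equivalent check: the itemset equals the intersection of all
    transactions that contain it (its closure).
    """
    items = frozenset(itemset)
    covering = [t for t in transactions if items <= t]
    if not covering:
        return False  # itemsets that never occur are not treated as closed here
    closure = frozenset.intersection(*covering)
    return closure == items


# Toy database (hypothetical data, for illustration only).
transactions = [
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ac"),
    frozenset("bd"),
]

print(all_confidence({"a", "b"}, transactions))
print(is_closed({"a", "b"}, transactions))
print(is_closed({"c"}, transactions))
```

In this toy database, {a, b} is closed because the transactions that contain it share no further item, whereas {c} is not closed because every transaction containing c also contains a.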