Entropy-based algorithm for discovering groups with mixed type attributes
E. Hernández, Xiaoou Li, L.E. Rocha
2006 3rd International Conference on Electrical and Electronics Engineering, 2006-12-04
DOI: 10.1109/ICEEE.2006.251892 (https://doi.org/10.1109/ICEEE.2006.251892)
Citations: 2
Abstract
Most clustering algorithms focus on datasets with only numeric or only categorical attributes. Recently, the problem of clustering mixed data has drawn interest because many real-life applications involve mixed data. In this research work, we propose a clustering algorithm called ACEM that is able to deal with mixed data. The algorithm first performs a pre-clustering on the purely categorical data. Then, including all of the mixed data, it evaluates the clusters using an entropy-based criterion in order to verify the cluster membership of the data. As a result, we obtain a clustering algorithm for mixed data whose main idea is to extend a categorical clustering algorithm by introducing an entropy criterion that measures cluster heterogeneity. We compare it with other clustering algorithms on real-life datasets to illustrate the performance of our algorithm.
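The abstract does not give the formula of the entropy criterion, so the following is only a minimal sketch of how heterogeneity of a mixed-attribute cluster could be measured, not the ACEM implementation. It assumes that numeric attributes are discretized into equal-width bins, that the heterogeneity of a cluster is the sum of the Shannon entropies of its attributes, and that a partition is scored by the size-weighted average over clusters. The names `shannon_entropy`, `discretize`, `cluster_heterogeneity`, and `expected_entropy` are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def discretize(column, lo, hi, bins=4):
    """Map numeric values to equal-width bin indices over [lo, hi] (assumed scheme)."""
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range
    return [min(int((x - lo) / width), bins - 1) for x in column]


def cluster_heterogeneity(rows, numeric_cols, ranges, bins=4):
    """Sum of per-attribute entropies for one cluster.

    rows         -- list of tuples, one per object in the cluster
    numeric_cols -- set of column indices holding numeric attributes
    ranges       -- dict: numeric column index -> (min, max) over the whole dataset
    """
    total = 0.0
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        if j in numeric_cols:
            lo, hi = ranges[j]
            col = discretize(col, lo, hi, bins)
        total += shannon_entropy(col)
    return total


def expected_entropy(clusters, numeric_cols, ranges, bins=4):
    """Size-weighted average heterogeneity of a partition (lower is more homogeneous)."""
    n = sum(len(c) for c in clusters)
    return sum(
        len(c) / n * cluster_heterogeneity(c, numeric_cols, ranges, bins)
        for c in clusters
    )


# Toy data: (color, size); column 1 is numeric with dataset range [0, 10].
good = [[("red", 1.0), ("red", 1.2)], [("blue", 9.0), ("blue", 8.5)]]
bad = [[("red", 1.0), ("blue", 9.0)], [("red", 1.2), ("blue", 8.5)]]
score_good = expected_entropy(good, {1}, {1: (0.0, 10.0)})
score_bad = expected_entropy(bad, {1}, {1: (0.0, 10.0)})
```

Under these assumptions, a partition that keeps similar objects together (`good`) scores lower than one that mixes them (`bad`), which is the sense in which an entropy criterion can be used to check whether an object belongs in its assigned cluster.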