A Hierarchical Clustering for Categorical Data Based on Holo-Entropy

Hao-jun Sun, Rongbo Chen, Shulin Jin, Yong Qin
Published in: 2015 12th Web Information System and Application Conference (WISA), 2015-09-11
DOI: 10.1109/WISA.2015.18 (https://doi.org/10.1109/WISA.2015.18)
Citations: 3

Abstract

Clustering high-dimensional data is a difficult task in cluster analysis, and subspace clustering is an effective approach to it. The principle of subspace clustering is to retain as much of the original data's information as possible while searching for the smallest subspace that can represent each cluster. Based on information entropy and holo-entropy, we propose an adaptive weighted subspace clustering algorithm for high-dimensional data. The algorithm uses information entropy to extract the feature subspace. To merge sub-clusters, it replaces the traditional similarity measure with a class compactness measure that combines holo-entropy with attribute weights in the subspace, and at each step it merges the two most compact sub-clusters to obtain the best clustering result. The algorithm is tested on nine UCI datasets and compared with other algorithms. In these tests it outperforms the compared algorithms in both efficiency and accuracy, and its results are highly reproducible.
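The abstract gives no formulas, so the merging step it describes can only be sketched under stated assumptions. In the Python sketch below, holo-entropy is taken as the sum of the Shannon entropies of the individual attributes, which is how the measure is usually defined. Everything else is hypothetical and not taken from the paper: the function names, the weighting rule in `entropy_weights`, and the merge criterion (the increase in size-scaled weighted holo-entropy caused by a merge).

```python
from collections import Counter
from itertools import combinations
from math import log2


def attribute_entropy(rows, j):
    """Shannon entropy (in bits) of categorical attribute j over the rows."""
    counts = Counter(row[j] for row in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def entropy_weights(rows):
    """Hypothetical weighting: low-entropy attributes get larger weights."""
    raw = [1.0 / (1.0 + attribute_entropy(rows, j)) for j in range(len(rows[0]))]
    total = sum(raw)
    return [r / total for r in raw]


def cluster_cost(rows, weights):
    """Size-scaled weighted holo-entropy; lower means a more compact cluster."""
    holo = sum(w * attribute_entropy(rows, j) for j, w in enumerate(weights))
    return len(rows) * holo


def merge_increase(a, b, weights):
    """Assumed criterion: how much the total cost grows if a and b are merged."""
    return cluster_cost(a + b, weights) - cluster_cost(a, weights) - cluster_cost(b, weights)


def agglomerate(rows, k, weights):
    """Start from singletons and merge the cheapest pair until k clusters remain."""
    clusters = [[row] for row in rows]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: merge_increase(clusters[p[0]], clusters[p[1]], weights),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because combinations() guarantees i < j
    return clusters


if __name__ == "__main__":
    # Toy categorical records, invented for illustration only.
    data = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "z"), ("b", "z"), ("b", "w")]
    for cluster in agglomerate(data, k=2, weights=entropy_weights(data)):
        print(cluster)
```

This exhaustive pair search re-evaluates every candidate merge at each step, so it only suits small examples. It is meant to show the shape of compactness-driven merging, not the efficiency claimed for the published algorithm.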