A Novel Clustering Method Combining Heuristics and Information Theorem

Zeng-Shun Zhao, Z. Hou, M. Tan
DOI: 10.1109/ICNC.2009.571
Published in: 2009 Fifth International Conference on Natural Computation
Publication date: 2009-08-14
Citations: 0

Abstract

Many data mining tasks require the unsupervised partitioning of a data set into clusters. In many cases, however, no prior knowledge about the clusters, such as their density or shape, is available. This paper addresses two major issues associated with conventional competitive learning: sensitivity to initialization and difficulty in determining the number of clusters. Many methods exist for such clustering, but most of them assume hyper-ellipsoidal clusters, and many heuristically proposed competitive learning methods and their variants are somewhat ad hoc, without theoretical support. Under these considerations, we propose an algorithm named Entropy-guided Splitting Competitive Learning (ESCL), formulated within an information-theoretic framework. Simulations show that minimization of the partition entropy can guide the competitive learning process and thereby estimate the number and structure of the probable data generators.