A Novel Clustering Method Combining Heuristics and Information Theorem

Zeng-Shun Zhao, Z. Hou, M. Tan
DOI: 10.1109/ICNC.2009.571
Published in: 2009 Fifth International Conference on Natural Computation
Publication date: 2009-08-14
Citations: 0

Abstract

Many data mining tasks require the unsupervised partitioning of a data set into clusters. In many cases, however, no prior knowledge about the clusters, such as their density or shape, is available. This paper addresses two major issues associated with conventional competitive learning: sensitivity to initialization and difficulty in determining the number of clusters. Many methods exist for such clustering, but most of them assume hyper-ellipsoidal clusters, and many heuristically proposed competitive learning methods and their variants are somewhat ad hoc, without theoretical support. Under these considerations, we propose an algorithm named Entropy-guided Splitting Competitive Learning (ESCL), formulated within an information-theoretic framework. Simulations show that minimization of the partition entropy can guide the competitive learning process and thereby estimate the number and structure of the probable data generators.