Using Self-Organizing Maps in constrained ensemble clustering framework

2012 12th International Conference on Intelligent Systems Design and Applications (ISDA) Pub Date : 2012-11-01 DOI:10.1109/ISDA.2012.6416541

R. Visakh

Citations: 2

Abstract

Clustering is a core data mining task that partitions a set of unlabelled data instances into distinct clusters, such that intra-cluster similarity is maximized and inter-cluster similarity is minimized. Many clustering techniques have been proposed in the literature, including both stand-alone and ensemble methods. Most of them lack robustness and share an important drawback: they cannot effectively visualize clustering results to support knowledge discovery and constructive learning. Recently, clustering techniques based on data visualization have been proposed. These rely on building a Self-Organizing Map (SOM), originally proposed by Kohonen. Although the Kohonen SOM preserves the topology of the input data, the clustering accuracy it achieves is widely observed to be poor. To perform robust and accurate clustering with a SOM, this paper proposes a cluster ensemble framework based on input constraints. A cluster ensemble is a set of clustering solutions obtained by individually clustering subsets of the original high-dimensional data. The final consensus matrix is fed to a neural network that transforms the input data into a lower-dimensional output map. The map clearly depicts the distribution of input data instances into clusters.
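The pipeline outlined in the abstract (ensemble members, a consensus matrix, then a SOM) can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not specify the base clusterer, how subsets are drawn, or the form of the input constraints, so the sketch assumes plain k-means on random feature subsets, a co-association matrix as the consensus matrix, and a small rectangular SOM, and it omits the constraints entirely. All function names and parameter values are hypothetical.

```python
import numpy as np

def kmeans_labels(X, k, rng, iters=20):
    """Plain Lloyd's k-means (assumed base clusterer); one label per row."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center as is
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def consensus_matrix(X, k, n_members, subset_size, rng):
    """Co-association matrix: fraction of members that co-cluster i and j."""
    n, d = X.shape
    C = np.zeros((n, n))
    for _ in range(n_members):
        # Assumption: each ensemble member sees a random subset of features.
        feats = rng.choice(d, size=subset_size, replace=False)
        labels = kmeans_labels(X[:, feats], k, rng)
        C += labels[:, None] == labels[None, :]
    return C / n_members

def train_som(data, rows, cols, rng, epochs=30, lr0=0.5):
    """Minimal online SOM with linearly decaying rate and neighbourhood."""
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    weights = rng.random((rows * cols, data.shape[1]))
    sigma0 = max(rows, cols) / 2.0
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            frac = step / total
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.1)
            # Best-matching unit: the node whose weights are closest to the sample.
            bmu = ((weights - data[i]) ** 2).sum(axis=1).argmin()
            grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (data[i] - weights)
            step += 1
    return weights, grid

def map_to_grid(data, weights, grid):
    """Grid coordinates of each sample's best-matching unit."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return grid[d2.argmin(axis=1)]

# Toy data: two Gaussian blobs in 6 dimensions (synthetic, for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 6)), rng.normal(3, 0.3, (20, 6))])
C = consensus_matrix(X, k=2, n_members=10, subset_size=3, rng=rng)
weights, grid = train_som(C, rows=4, cols=4, rng=rng)
positions = map_to_grid(C, weights, grid)  # 2-D map position per instance
```

Each row of the consensus matrix describes one instance by how often it is grouped with every other instance, so instances that the ensemble consistently places together tend to land on the same or neighbouring SOM nodes. Plotting `positions` would give the kind of low-dimensional cluster map the abstract refers to.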
