A new clustering strategy with stochastic merging and removing based on kernel functions

Huimin Geng, H. Ali
DOI: 10.1109/CSBW.2005.10 (https://doi.org/10.1109/CSBW.2005.10)
Published in: 2005 IEEE Computational Systems Bioinformatics Conference - Workshops (CSBW'05)
Publication date: 2005-08-08
Citations: 2

Abstract

In hierarchical clustering methods, divisions or fusions, once made, are irrevocable. In a bottom-up algorithm, two elements assigned to one cluster cannot subsequently be separated; in a top-down algorithm, two elements that have been separated cannot be rejoined. This greedy property may lead to premature convergence and, consequently, to a clustering that is far from optimal. To overcome this problem, we propose a new Stochastic Message Passing Clustering (SMPC) method based on the Message Passing Clustering (MPC) algorithm introduced in our earlier work. SMPC generalizes MPC by extending the clustering algorithm from a deterministic process to a stochastic one, which brings two major advantages. First, when deciding which pair of clusters to merge, the influences of all clusters are quantified by probabilities estimated with kernel functions of their relative distances. Second, clustering can be undone to improve performance: when the algorithm detects elements that have low probabilities inside their cluster, it moves them out. Test results on colon cancer gene-expression data show that SMPC performs better than both deterministic MPC and hierarchical clustering.
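
The abstract does not give SMPC's formulas, so the following Python sketch only illustrates the two ideas it names: kernel-based merge probabilities computed over all cluster pairs, and detection of poorly fitting members that could be moved out. The Gaussian kernel, the use of centroid distances, the `bandwidth` and `threshold` parameters, and all function names are assumptions made for illustration, not the authors' definitions.

```python
import math
import random


def gaussian_kernel(distance, bandwidth=1.0):
    """Map a non-negative distance to a weight in (0, 1] (assumed kernel)."""
    return math.exp(-(distance ** 2) / (2.0 * bandwidth ** 2))


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(cluster[0])
    return [sum(p[i] for p in cluster) / len(cluster) for i in range(dim)]


def merge_probabilities(clusters, bandwidth=1.0):
    """Return {(i, j): probability} over all cluster pairs with i < j.

    Each pair's kernel weight is divided by the total weight, so the
    probability of merging one pair depends on the distances among all
    clusters, not only on that pair.
    """
    centers = [centroid(c) for c in clusters]
    weights = {}
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            d = math.dist(centers[i], centers[j])
            weights[(i, j)] = gaussian_kernel(d, bandwidth)
    total = sum(weights.values())
    if total == 0.0:
        # All weights underflowed to zero: fall back to a uniform choice.
        return {pair: 1.0 / len(weights) for pair in weights}
    return {pair: w / total for pair, w in weights.items()}


def sample_merge_pair(clusters, rng, bandwidth=1.0):
    """Draw the pair to merge at random instead of always taking the closest."""
    probs = merge_probabilities(clusters, bandwidth)
    pairs = list(probs)
    return rng.choices(pairs, weights=[probs[p] for p in pairs], k=1)[0]


def weak_members(cluster, bandwidth=1.0, threshold=0.5):
    """Members whose kernel weight to their own centroid is below threshold.

    These are candidates for removal (hypothetical criterion).
    """
    center = centroid(cluster)
    return [
        p for p in cluster
        if gaussian_kernel(math.dist(p, center), bandwidth) < threshold
    ]


if __name__ == "__main__":
    clusters = [[(0.0, 0.0), (0.2, 0.1)], [(0.5, 0.4)], [(5.0, 5.0)]]
    print(merge_probabilities(clusters))
    print(sample_merge_pair(clusters, random.Random(0)))
    print(weak_members([(0.0, 0.0), (0.1, 0.0), (3.0, 0.0)]))
```

Sampling the pair, rather than taking the maximum, is what makes the merge step stochastic in this sketch; a deterministic variant would simply pick the pair with the highest probability.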