{"title":"A new clustering strategy with stochastic merging and removing based on kernel functions","authors":"Huimin Geng, H. Ali","doi":"10.1109/CSBW.2005.10","DOIUrl":null,"url":null,"abstract":"With hierarchical clustering methods, divisions or fusions, once made, are irrevocable. As a result, when two elements in a bottom-up algorithm are assigned to one cluster, they cannot subsequently be separated. Also, when a top-down algorithm separates two elements, they can't be rejoined. Such greedy property may lead to premature convergence and consequently lead to a clustering that is far from optimal. To overcome this problem, we propose a new Stochastic Message Passing Clustering (SMPC) method based on the Message Passing Clustering (MPC) algorithm introduced in our earlier work. SMPC, as a generalized version of MPC, extends the clustering algorithm from a deterministic process to a stochastic process, adding two major advantages. First, in deciding the merging cluster pair, the influences of all clusters are quantified by probabilities, estimated by kernel functions based on their relative distances. Secondly, clustering can be undone to improve the clustering performance when the algorithm detects elements which don't have good probabilities inside the cluster and moves them outside. 
The test results on colon cancer gene-expression data show that SMPC performs better than the deterministic MPC or hierarchical clustering method.","PeriodicalId":123531,"journal":{"name":"2005 IEEE Computational Systems Bioinformatics Conference - Workshops (CSBW'05)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2005 IEEE Computational Systems Bioinformatics Conference - Workshops (CSBW'05)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CSBW.2005.10","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
With hierarchical clustering methods, divisions or fusions, once made, are irrevocable. As a result, once a bottom-up algorithm assigns two elements to the same cluster, they cannot subsequently be separated; likewise, once a top-down algorithm separates two elements, they cannot be rejoined. This greedy property may lead to premature convergence and, consequently, to a clustering that is far from optimal. To overcome this problem, we propose a new Stochastic Message Passing Clustering (SMPC) method based on the Message Passing Clustering (MPC) algorithm introduced in our earlier work. As a generalization of MPC, SMPC extends the clustering algorithm from a deterministic process to a stochastic one, which brings two major advantages. First, when deciding which pair of clusters to merge, the influences of all clusters are quantified by probabilities that are estimated by kernel functions from their relative distances. Second, clustering decisions can be undone to improve performance: when the algorithm detects elements that have low probabilities inside their cluster, it moves them outside. Test results on colon cancer gene-expression data show that SMPC performs better than both deterministic MPC and hierarchical clustering.
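The abstract does not specify the kernel, the distance measure, or the removal rule, so the following is only a minimal sketch of the two ideas it describes, not the authors' SMPC algorithm. It assumes a Gaussian kernel, Euclidean distance between cluster centroids, and a fixed weight threshold for flagging weak members; the function names, `bandwidth`, and `threshold` are all hypothetical.

```python
import math
import random


def gaussian_kernel(distance, bandwidth=1.0):
    """Map a distance to an unnormalized weight (assumed kernel, not from the paper)."""
    return math.exp(-(distance ** 2) / (2.0 * bandwidth ** 2))


def centroid(cluster):
    """Mean point of a cluster given as a list of equal-length tuples."""
    n = len(cluster)
    return tuple(sum(p[i] for p in cluster) / n for i in range(len(cluster[0])))


def merge_probabilities(clusters, bandwidth=1.0):
    """Return {(i, j): probability} over all cluster pairs; values sum to 1."""
    centers = [centroid(c) for c in clusters]
    weights = {}
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            d = math.dist(centers[i], centers[j])
            weights[(i, j)] = gaussian_kernel(d, bandwidth)
    total = sum(weights.values())
    if total == 0:
        # Fewer than two clusters, or all kernel weights underflowed to zero.
        raise ValueError("no cluster pair with positive kernel weight")
    return {pair: w / total for pair, w in weights.items()}


def sample_merge_pair(clusters, rng, bandwidth=1.0):
    """Draw the pair to merge at random instead of always taking the closest pair."""
    probs = merge_probabilities(clusters, bandwidth)
    pairs = list(probs)
    return rng.choices(pairs, weights=[probs[p] for p in pairs], k=1)[0]


def weak_members(cluster, threshold=0.3, bandwidth=1.0):
    """Elements whose kernel weight to their own centroid is below the
    (hypothetical) threshold; these are candidates to be moved out."""
    center = centroid(cluster)
    return [
        p for p in cluster
        if gaussian_kernel(math.dist(p, center), bandwidth) < threshold
    ]
```

A call such as `sample_merge_pair(clusters, random.Random(0))` picks a pair in proportion to its kernel weight, so nearby clusters are favored but distant ones keep a nonzero chance. That randomness, together with the ability to eject elements returned by `weak_members`, is what lets a stochastic scheme of this kind back out of an early greedy choice.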