Novel hybrid hierarchical-K-means clustering method (H-K-means) for microarray analysis

Bernard Chen, R. Harrison, Yi Pan, P. Tai
Published in: 2005 IEEE Computational Systems Bioinformatics Conference - Workshops (CSBW'05), 2005-08-08
DOI: 10.1109/CSBW.2005.98 (https://doi.org/10.1109/CSBW.2005.98)
Citations: 97

Abstract

Hierarchical and K-means clustering are two major analytical tools for the unsupervised analysis of microarray datasets. However, both have inherent disadvantages. Hierarchical clustering cannot represent distinct clusters with similar expression patterns, and as clusters grow in size, the actual expression patterns become less relevant. K-means clustering requires the number of clusters to be specified in advance and chooses initial centroids randomly; in addition, it is sensitive to outliers. We present a novel hybrid approach that combines the merits of the two methods and discards the disadvantages mentioned above. It differs from the existing hybrid method, which carries out hierarchical clustering in a first round to decide the location and number of clusters and then runs K-means clustering in a second round. In brief, our approach clusters about half of the data through hierarchical clustering and follows with K-means for the remaining half, all in a single round. Our approach also provides a mechanism to handle outliers. Compared with the existing hybrid clustering approach and with K-means clustering under two different distance measures on Eisen's yeast microarray data, our method always generates much higher-quality clusters.
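The two-round hybrid that the abstract contrasts itself with can be sketched as follows. This is a minimal illustration of that baseline idea only (hierarchical clustering supplies the number and location of the clusters, then K-means refines them), not the authors' single-round H-K-means, whose details the abstract does not give. The function names, the centroid-linkage merge rule, the distance threshold used as a stopping criterion, and the toy data are all assumptions made for the example.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def hierarchical_seeds(points, threshold):
    """Agglomerative clustering with centroid linkage (an assumed choice).

    Repeatedly merges the two clusters whose centroids are closest and stops
    once the closest pair is farther apart than `threshold`. The surviving
    centroids give both the number of clusters and their initial locations.
    """
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        cents = [centroid(c) for c in clusters]
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(cents[i], cents[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [centroid(c) for c in clusters]

def kmeans(points, seeds, max_iter=100):
    """Lloyd's algorithm started from the given seeds instead of random ones."""
    centroids = [list(c) for c in seeds]  # do not mutate the caller's seeds
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = centroid(members)
    return labels, centroids

# Toy data (hypothetical): two well-separated groups of three points each.
data = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
seeds = hierarchical_seeds(data, threshold=1.0)  # round 1: decide k and locations
labels, centers = kmeans(data, seeds)            # round 2: refine with K-means
```

Seeding K-means from the hierarchical stage removes the two inputs the abstract identifies as weaknesses of plain K-means: a cluster count fixed in advance and randomly chosen initial centroids. The threshold takes over the role of the cluster count, so it still has to be chosen with care for real expression data.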