Performance Analysis of Hard and Soft Clustering Approaches For Gene Expression Data

P. K. N. Banu, S. Andrews
Journal: Int. J. Rough Sets Data Anal.
DOI: https://doi.org/10.4018/ijrsda.2015010104
Citations: 33

Abstract

Mining of gene expression data is a rapidly growing field that aims to predict gene expression patterns and to assist clinicians in the early diagnosis of tumor formation. Clustering is the most important phase of this analysis: it helps to find groups of genes that are highly expressed or suppressed. This paper analyses the performance of the most representative hard and soft off-line clustering algorithms, namely K-Means, Fuzzy C-Means, Self-Organizing Map (SOM) based clustering, and Genetic Algorithm (GA) based clustering, on a brain tumor gene expression dataset. The clusters produced by these algorithms are indications of cellular processes. Clustering results are evaluated for cluster compactness and separation using the Xie-Beni index (XB), the Davies-Bouldin index (DB), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Dunn's Index (DI), along with the time taken. Experimental results show that soft clustering approaches work well for predicting clusters of highly expressed and suppressed genes.
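As a rough illustration of the hard versus soft distinction and of two of the indices named in the abstract, the following Python sketch runs a plain K-Means and a plain Fuzzy C-Means on synthetic data and scores both partitions with the Davies-Bouldin and Xie-Beni indices. It is not the authors' implementation: the synthetic data, all function names, the fuzzifier m = 2, and the cluster count of 3 are assumptions chosen for illustration, and the brain tumor dataset is not used.

```python
import numpy as np

def pairwise_dist(A, B):
    """Euclidean distance between every row of A and every row of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)

def kmeans(X, k, n_iter=100, seed=0):
    """Hard clustering: each point belongs to exactly one cluster."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = pairwise_dist(X, centers).argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, pairwise_dist(X, centers).argmin(axis=1)

def fuzzy_cmeans(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Soft clustering: each point has a membership degree in every cluster."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # rows sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.fmax(pairwise_dist(X, centers), 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def davies_bouldin(X, centers, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / separation_ij."""
    k = len(centers)
    scatter = np.array([
        np.linalg.norm(X[labels == j] - centers[j], axis=1).mean()
        if np.any(labels == j) else 0.0
        for j in range(k)
    ])
    sep = pairwise_dist(centers, centers)
    np.fill_diagonal(sep, np.inf)              # exclude i == j from the max
    return ((scatter[:, None] + scatter[None, :]) / sep).max(axis=1).mean()

def xie_beni(X, centers, U, m=2.0):
    """Membership-weighted compactness divided by n times the minimum
    squared distance between two centers."""
    d2 = pairwise_dist(X, centers) ** 2
    sep2 = pairwise_dist(centers, centers) ** 2
    np.fill_diagonal(sep2, np.inf)
    return ((U ** m) * d2).sum() / (len(X) * sep2.min())

# Synthetic stand-in for expression profiles (rows = genes, columns = samples).
rng = np.random.default_rng(42)
true_centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(30, 2)) for c in true_centers])

centers_km, labels_km = kmeans(X, k=3)
centers_fcm, U = fuzzy_cmeans(X, c=3)
labels_fcm = U.argmax(axis=1)                  # harden memberships for DB
U_km = np.eye(3)[labels_km]                    # hard partition as 0/1 memberships

db_km = davies_bouldin(X, centers_km, labels_km)
db_fcm = davies_bouldin(X, centers_fcm, labels_fcm)
xb_km = xie_beni(X, centers_km, U_km)
xb_fcm = xie_beni(X, centers_fcm, U)
print(f"K-Means: DB={db_km:.3f} XB={xb_km:.3f}")
print(f"FCM:     DB={db_fcm:.3f} XB={xb_fcm:.3f}")
```

For both indices a lower value indicates more compact and better separated clusters. The sketch scores the K-Means result with the Xie-Beni index by treating the hard partition as 0/1 memberships, which is one common convention and not necessarily the one used in the paper.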