Evidence Accumulation from Some Clustering Algorithms to Improve Gene Expression Data Classification

Ranjita Das, S. Saha
2016 3rd International Conference on Soft Computing & Machine Intelligence (ISCMI), published 2016-11-01
DOI: 10.1109/ISCMI.2016.54 (https://doi.org/10.1109/ISCMI.2016.54)

Abstract

Ensemble-based clustering combines the data partitions produced by multiple clustering algorithms. We consider several recently developed clustering algorithms as the base approaches for generating multiple clustering solutions: the point-symmetry-distance-based genetic clustering technique (GAPS), symmetry-based differential evolution and particle swarm optimization clustering algorithms, and the popular K-means and fuzzy C-means algorithms. Each base algorithm decomposes the initial data, N patterns of dimension d, into k compact clusters. The ensemble derives a single combined solution from the set of individual partitionings, with the aim of increasing the accuracy of the final partitioning. Evidence on pattern association is accumulated by a link-based ensemble method called CTS, which maps the partitionings into an N × N matrix that represents a new similarity measure between patterns. The final data partition is obtained by applying the single-linkage clustering algorithm to this new similarity matrix. Experiments use several publicly available gene expression datasets. To validate the clustering solutions obtained from the link-based cluster ensemble method and from the individual base clustering algorithms, the internal cluster validity indices DB index and Dunn index are used.
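The accumulation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names CTS but does not define it, so the sketch substitutes the simpler co-association (voting) matrix, where the similarity of two patterns is the fraction of base partitions that put them in the same cluster. The function names `co_association` and `single_linkage` and the toy label vectors are assumptions made for illustration only.

```python
from itertools import combinations


def co_association(partitions):
    """N x N matrix: fraction of base partitions that co-cluster patterns i and j.

    Assumption: plain voting is used here in place of CTS.
    """
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def single_linkage(sim, k):
    """Single-linkage clustering on a similarity matrix, cut at k clusters.

    Pairs are visited from most to least similar and their groups are
    merged (union-find) until only k groups remain.
    """
    n = len(sim)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = sorted(combinations(range(n), 2), key=lambda p: -sim[p[0]][p[1]])
    clusters = n
    for i, j in pairs:
        if clusters == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            clusters -= 1

    # Relabel group representatives as 0, 1, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


# Hypothetical label vectors from four base clusterings of six patterns.
# Label values are arbitrary per partition; only co-membership matters.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping, labels permuted
    [0, 0, 1, 1, 1, 1],  # pattern 2 placed in the other group
    [0, 0, 0, 1, 1, 0],  # pattern 5 placed in the other group
]

sim = co_association(partitions)
consensus = single_linkage(sim, k=2)
print(consensus)
```

On this toy input each misplaced pattern is outvoted by the other three partitions, so the consensus recovers the grouping shared by the majority. A real experiment would replace the toy label vectors with the outputs of the base algorithms and score the result with an internal validity index.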