Voting-based Approach in Consensus Clustering through q-fold Cross-validation
Norin Rahayu Shamsuddin, N. Mahat
Electronic Journal of Applied Statistical Analysis, vol. 12, published 2019-11-20
DOI: https://doi.org/10.1285/I20705948V12N3P657
Over the past 50 years, extensive research has been carried out to understand how clustering works in classifying data into meaningful groups. Various clustering algorithms and cluster validity indexes have been proposed and refined to obtain the best clustering result. However, no clustering method is able to give consistent results on datasets with similar structure. An alternative mechanism to control the variation of results and improve the quality of traditional clustering is consensus clustering. In this paper, we generate the multiple partitions for consensus clustering through a resampling method that employs a q-fold cross-validation approach. The q-fold cross-validation approach speeds up the generation of consensus partitions, requiring only q iterations. To handle the differing numbers of cluster labels that occur across the partitions, we employ a voting-based method in the second stage of consensus clustering to obtain the optimal consensus partition. The performance of the optimal consensus partitions is evaluated using the silhouette plot.
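The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the base clusterer (a small k-means with farthest-first initialisation), the rule of fitting on q-1 folds and labelling every point by nearest centroid, the fixed number of clusters k, and all function names (`kmeans`, `qfold_partitions`, `align`, `vote`) are assumptions made for illustration. In particular, the paper addresses partitions with differing numbers of labels, which this sketch does not attempt.

```python
import itertools
import random
from collections import Counter


def dist2(a, b):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: dist2(p, centroids[j]))


def kmeans(points, k, iters=20):
    """Lloyd's k-means with farthest-first initialisation (an assumed choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Add the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as its group's mean; keep it if the group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids


def qfold_partitions(points, k, q, seed=0):
    """Stage 1 (assumed form): one partition per fold, q partitions in total."""
    order = list(range(len(points)))
    random.Random(seed).shuffle(order)
    partitions = []
    for fold in range(q):
        held_out = set(order[fold::q])
        train = [points[i] for i in range(len(points)) if i not in held_out]
        centroids = kmeans(train, k)
        # Label every point, including the held-out fold, by nearest centroid.
        partitions.append([nearest(p, centroids) for p in points])
    return partitions


def align(partition, reference, k):
    """Relabel `partition` with the label permutation that best matches `reference`."""
    best = max(
        itertools.permutations(range(k)),
        key=lambda perm: sum(perm[a] == b for a, b in zip(partition, reference)),
    )
    return [best[a] for a in partition]


def vote(partitions, k):
    """Stage 2 (assumed form): align to the first partition, then majority vote."""
    aligned = [partitions[0]] + [align(p, partitions[0], k) for p in partitions[1:]]
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*aligned)]


# Toy data: two well-separated groups of ten points each.
base = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (0.4, 0.4), (0.2, 0.1),
        (0.6, 0.5), (0.3, 0.7), (0.7, 0.1), (0.1, 0.3), (0.5, 0.6)]
data = base + [(x + 10.0, y + 10.0) for x, y in base]
consensus = vote(qfold_partitions(data, k=2, q=5), k=2)
```

Label alignment is needed because cluster labels are arbitrary: two partitions can describe the same grouping while using swapped label names, and a vote over unaligned labels would be meaningless. Searching all permutations is only practical for small k; a larger k would call for an assignment solver instead.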