MMKK++ algorithm for clustering heterogeneous images into an unknown number of clusters

Q4 Computer Science
Dávid Papp, G. Szűcs
Journal: Electronic Letters on Computer Vision and Image Analysis
Volume: 9 1
Pages: 30-45
Published: 2018-01-10
Publication type: Journal Article
DOI: 10.5565/REV/ELCVIA.1054
URL: https://doi.org/10.5565/REV/ELCVIA.1054
Source platform: Semanticscholar
Citations: 4

Abstract

In this paper we present an automatic clustering procedure whose main aim is to predict the number of clusters in a set of unknown, heterogeneous images. We used Fisher vectors as the mathematical representation of the images, and these vectors served as the input data points for the clustering algorithm. We implemented a novel variant of K-means, the kernel K-means++, and furthermore the min-max kernel K-means++ (MMKK++), as the clustering method. The proposed approach examines several candidate cluster numbers and determines the strength of each clustering to estimate how well the data fit into K clusters; the law of large numbers was also used to choose the optimal cluster size. We conducted experiments on four image sets to demonstrate the efficiency of our solution. The first two image sets are subsets of different popular collections, the third is their union, and the fourth is the complete Caltech101 image set. The results showed that our approach gave a better estimate of the number of clusters than the competing methods. Furthermore, we defined two new metrics for evaluating the prediction of the appropriate cluster number, which measure goodness in a more sophisticated way than a binary evaluation.
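For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of plain kernel K-means with K-means++-style seeding on a precomputed kernel matrix. It is not the authors' MMKK++: the min-max weighting, the Fisher-vector features, and the cluster-number estimation are all omitted, and the function name `kernel_kmeans_pp` and its parameters are hypothetical choices made for illustration.

```python
import numpy as np

def kernel_kmeans_pp(K, k, n_iter=50, seed=0):
    """Kernel K-means with K-means++-style seeding (illustrative sketch).

    K is a precomputed n x n kernel (Gram) matrix, k the number of clusters.
    Returns an array of n cluster labels in the range 0..k-1.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    diag = np.diag(K)
    # Squared feature-space distance between every pair of points:
    # ||phi(x_i) - phi(x_j)||^2 = K_ii + K_jj - 2 K_ij.
    d2 = np.maximum(diag[:, None] + diag[None, :] - 2.0 * K, 0.0)

    # K-means++ seeding: each new seed is drawn with probability
    # proportional to its squared distance from the nearest chosen seed.
    seeds = [int(rng.integers(n))]
    for _ in range(1, k):
        closest = d2[:, seeds].min(axis=1)
        seeds.append(int(rng.choice(n, p=closest / closest.sum())))
    labels = d2[:, seeds].argmin(axis=1)

    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            members = labels == c
            m = members.sum()
            if m == 0:
                continue  # an empty cluster stays unreachable
            # ||phi(x_i) - mu_c||^2 written with kernel evaluations only.
            dist[:, c] = (diag
                          - 2.0 * K[:, members].sum(axis=1) / m
                          + K[np.ix_(members, members)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments no longer change
        labels = new_labels
    return labels
```

With a linear kernel (`K = X @ X.T`) this reduces to ordinary K-means on the rows of `X`; any other positive semi-definite kernel can be passed in the same way. Estimating the number of clusters, as the paper does, would require running such a routine for several candidate values of `k` and comparing the resulting clusterings, which this sketch does not attempt.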
Source journal
Electronic Letters on Computer Vision and Image Analysis
Subject area: Computer Science, Computer Vision and Pattern Recognition
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 19
Review time: 12 weeks