On the optimality of k-means clustering

2013 IEEE International Workshop on Genomic Signal Processing and Statistics. Pub Date: 2013-11-01. DOI: 10.1109/GENSIPS.2013.6735934

Lori A. Dalton

Citations: 4

Abstract

Although cluster analysis is typically accepted to be a subjective activity, without an objective framework it is impossible to understand, let alone guarantee, the predictive capacity of clustering. To address this, recent work uses random point process theory to develop a probabilistic theory of clustering. The theory fully parallels Bayes decision theory for classification: given a known underlying process and a specified cost function, there exist Bayes clustering operators with minimum expected error. Clustering is hence transformed from a subjective activity into an objective operation. In this work, we present conditions under which the optimization function used in classical k-means clustering is optimal in the new Bayes clustering theory, and thus begin to understand this algorithm objectively.
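For readers unfamiliar with the "optimization function utilized in classical k-means clustering", the following is a minimal sketch of the standard objective: the within-cluster sum of squared Euclidean distances to each cluster mean. It illustrates only the classical criterion, not the Bayes clustering theory or the optimality conditions of the paper, which the abstract does not state. The function name `kmeans_cost` and the toy data are assumptions made for illustration.

```python
from typing import List, Sequence


def kmeans_cost(points: Sequence[Sequence[float]], labels: Sequence[int], k: int) -> float:
    """Classical k-means objective for a fixed partition (hypothetical helper).

    Returns the sum, over all points, of the squared Euclidean distance
    between a point and the mean of the cluster it is assigned to.
    Assumes every cluster index in range(k) that appears in `labels`
    has at least one point.
    """
    dim = len(points[0])
    sums: List[List[float]] = [[0.0] * dim for _ in range(k)]
    counts = [0] * k

    # Accumulate per-cluster coordinate sums to obtain the cluster means.
    for p, c in zip(points, labels):
        counts[c] += 1
        for j in range(dim):
            sums[c][j] += p[j]

    # Add up squared distances from each point to its own cluster mean.
    cost = 0.0
    for p, c in zip(points, labels):
        for j in range(dim):
            mean_j = sums[c][j] / counts[c]
            cost += (p[j] - mean_j) ** 2
    return cost


# Toy data (an assumption): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = kmeans_cost(points, [0, 0, 1, 1], k=2)  # groups nearby points
bad = kmeans_cost(points, [0, 1, 0, 1], k=2)   # splits each nearby pair
print(good, bad)
```

Classical k-means searches for the partition that minimizes this cost; in the toy data the partition grouping nearby points has the lower cost. Whether that minimizer is also the minimum-expected-error clustering depends on the underlying random point process and cost function, which is the question the paper addresses.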
