Cluster Validation by Measurement of Clustering Characteristics Relevant to the User

C. Hennig
Published in: Data Analysis and Applications 1
Publication date: 2017-03-27
DOI: 10.1002/9781119597568.CH1 (https://doi.org/10.1002/9781119597568.CH1)
Citations: 35

Abstract

Many cluster analysis methods exist, and they can produce quite different clusterings of the same dataset. Cluster validation is the evaluation of the quality of a clustering; "relative cluster validation" uses such quality criteria to compare clusterings. It can be used to select one of a set of clusterings produced by different methods, or by the same method run with different parameters such as different numbers of clusters. There are many cluster validation indexes in the literature. Most of them attempt to measure the overall quality of a clustering by a single number, but this can be inappropriate. Various characteristics of a clustering can be relevant in practice, depending on the aim of clustering, such as low within-cluster distances and high between-cluster separation. In this paper, a number of validation criteria are introduced that refer to different desirable characteristics of a clustering and that characterise a clustering in a multidimensional way. In specific applications the user may be interested in some of these criteria rather than others. A focus of the paper is on methodology to standardise the different characteristics so that users can aggregate them in a suitable way, specifying weights for the criteria that are relevant in the clustering application at hand.
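As a rough illustration of the idea of standardising several criteria and aggregating them with user-chosen weights, the following Python sketch scores three candidate clusterings of a toy dataset. Everything in it is an assumption made for illustration: the abstract does not say which indexes or which standardisation the paper uses, so the two criteria (average within-cluster distance and minimum between-cluster separation), the z-score standardisation across candidates, and all function names are hypothetical stand-ins rather than the author's method.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev


def avg_within_distance(points, labels):
    """Mean distance over all pairs of points that share a cluster label."""
    labelled = list(zip(points, labels))
    return mean(dist(p, q) for (p, a), (q, b) in combinations(labelled, 2) if a == b)


def min_separation(points, labels):
    """Smallest distance between two points in different clusters."""
    labelled = list(zip(points, labels))
    return min(dist(p, q) for (p, a), (q, b) in combinations(labelled, 2) if a != b)


def standardise(values):
    """Z-scores across the candidate clusterings (an illustrative choice only)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s if s > 0 else 0.0 for v in values]


def aggregate(points, candidates, weights):
    """Weighted sum of standardised criteria, one score per candidate clustering."""
    # Negate the within-cluster distance so that larger is better for both criteria.
    homogeneity = standardise([-avg_within_distance(points, c) for c in candidates])
    separation = standardise([min_separation(points, c) for c in candidates])
    return [
        weights["homogeneity"] * h + weights["separation"] * s
        for h, s in zip(homogeneity, separation)
    ]


# Toy data: three well-separated pairs of points on a plane.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
candidates = [
    [0, 0, 1, 1, 2, 2],  # one cluster per pair
    [0, 0, 0, 0, 1, 1],  # merges the first two pairs
    [0, 1, 0, 1, 0, 1],  # splits by y-coordinate, cutting across the pairs
]
weights = {"homogeneity": 1.0, "separation": 1.0}  # user-specified importance

scores = aggregate(points, candidates, weights)
best = max(range(len(candidates)), key=scores.__getitem__)
print("scores:", [round(s, 3) for s in scores], "best candidate:", best)
```

With equal weights the first candidate scores highest, since it has the smallest within-cluster distances and ties for the largest separation. Changing the weights changes how much each characteristic counts towards the aggregate score, which is the kind of user control the abstract describes. The sketch assumes every candidate has at least two clusters and at least one cluster with two or more points.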