{"title":"多变量数据的聚类分析:一种基于加权空间秩的方法","authors":"Mohammed H. Baragilly, Hend Gabr, Brian H. Willis","doi":"10.1155/2023/8849404","DOIUrl":null,"url":null,"abstract":"Determining the right number of clusters without any prior information about their numbers is a core problem in cluster analysis. In this paper, we propose a nonparametric clustering method based on different weighted spatial rank (WSR) functions. The main idea behind WSR is to define a dissimilarity measure locally based on a localized version of multivariate ranks. We consider a nonparametric Gaussian kernel weights function. We compare the performance of the method with other standard techniques and assess its misclassification rate. The method is completely data-driven, robust against distributional assumptions, and accurate for the purpose of intuitive visualization and can be used both to determine the number of clusters and assign each observation to its cluster.","PeriodicalId":44760,"journal":{"name":"Journal of Probability and Statistics","volume":"74 1","pages":"0"},"PeriodicalIF":1.0000,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Clustering Analysis of Multivariate Data: A Weighted Spatial Ranks-Based Approach\",\"authors\":\"Mohammed H. Baragilly, Hend Gabr, Brian H. Willis\",\"doi\":\"10.1155/2023/8849404\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Determining the right number of clusters without any prior information about their numbers is a core problem in cluster analysis. In this paper, we propose a nonparametric clustering method based on different weighted spatial rank (WSR) functions. The main idea behind WSR is to define a dissimilarity measure locally based on a localized version of multivariate ranks. We consider a nonparametric Gaussian kernel weights function. We compare the performance of the method with other standard techniques and assess its misclassification rate. The method is completely data-driven, robust against distributional assumptions, and accurate for the purpose of intuitive visualization and can be used both to determine the number of clusters and assign each observation to its cluster.\",\"PeriodicalId\":44760,\"journal\":{\"name\":\"Journal of Probability and Statistics\",\"volume\":\"74 1\",\"pages\":\"0\"},\"PeriodicalIF\":1.0000,\"publicationDate\":\"2023-09-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Probability and Statistics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1155/2023/8849404\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"STATISTICS & PROBABILITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Probability and Statistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1155/2023/8849404","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
Clustering Analysis of Multivariate Data: A Weighted Spatial Ranks-Based Approach
Determining the right number of clusters without any prior information about their numbers is a core problem in cluster analysis. In this paper, we propose a nonparametric clustering method based on different weighted spatial rank (WSR) functions. The main idea behind WSR is to define a dissimilarity measure locally based on a localized version of multivariate ranks. We consider a nonparametric Gaussian kernel weights function. We compare the performance of the method with other standard techniques and assess its misclassification rate. The method is completely data-driven, robust against distributional assumptions, and accurate for the purpose of intuitive visualization and can be used both to determine the number of clusters and assign each observation to its cluster.