An Empirical Scrutinization of Four Crisp Clustering Methods with Four Distance Metrics and One Straightforward Interpretation Rule
T. A. Alvandyan, S. Shalileh
Doklady Mathematics, volume 110, supplement 1, pages S236-S250. Published 2025-03-22. Journal article.
DOI: 10.1134/S1064562424602002
https://link.springer.com/article/10.1134/S1064562424602002
Citation count: 0
Abstract
Clustering has always been in great demand in the scientific and industrial communities. However, because ground truth is lacking, the interpretation of clustering results is open to debate. This research provides an empirical benchmark of the efficiency of four crisp clustering methods: three popular ones and one recently proposed. To this end, we extensively analyzed the four methods by applying them to nine real-world and 420 synthetic datasets using four different values of p in the Minkowski distance. Furthermore, we validated a previously proposed, yet not well-known, straightforward rule for interpreting the recovered clusters. Our computations showed that (i) Nesterov gradient descent clustering is the most effective clustering method on our real-world data, while K-Means had an edge over it on our synthetic data; (ii) the Minkowski distance with p = 1 is the most effective distance function; and (iii) the investigated cluster interpretation rule is intuitive and valid.
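To make the role of p concrete, the following is a minimal sketch of the Minkowski distance and of a crisp (one cluster per point) nearest-center assignment. It is not the authors' code and does not implement any of the four benchmarked methods. The function names `minkowski` and `assign_crisp`, the example point and centers, and the p values shown are all illustrative assumptions; the abstract does not state which four values of p were used.

```python
import math


def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if math.isinf(p):
        # Limit p -> infinity: the largest coordinate-wise difference.
        return max(abs(a - b) for a, b in zip(x, y))
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def assign_crisp(points, centers, p):
    """Give each point exactly one label: the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda k: minkowski(pt, centers[k], p))
        for pt in points
    ]


# Hypothetical data, chosen so that the nearest center depends on p.
points = [(2.0, 2.0)]
centers = [(0.0, 0.0), (2.0, 5.0)]

# Illustrative p values only (not taken from the paper).
for p in (1, 2, 3, math.inf):
    print(p, assign_crisp(points, centers, p))
```

With p = 1 the point is closer to the second center (distance 3 versus 4), whereas with p = 2 it is closer to the first (about 2.83 versus 3). This small case shows why the choice of p can change the recovered clusters.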
About the journal:
Doklady Mathematics is a journal of the Presidium of the Russian Academy of Sciences. It contains English translations of papers published in Doklady Akademii Nauk (Proceedings of the Russian Academy of Sciences), which was founded in 1933 and is published 36 times a year. Doklady Mathematics covers the following areas: mathematics, mathematical physics, computer science, control theory, and computers. It publishes brief scientific reports on previously unpublished significant new research in mathematics and its applications. The main contributors to the journal are Members of the RAS, Corresponding Members of the RAS, and scientists from the former Soviet Union and other foreign countries. The contributors include outstanding Russian mathematicians.