Hierarchical Clustering Algorithm Based on Fast and Uniform Segmentation

2022 12th International Conference on Cloud Computing, Data Science & Engineering (Confluence)
Pub Date: 2022-01-27
DOI: 10.1109/confluence52989.2022.9734143

Xiaojun Wu, Jingjing Wei, Sheng Yuan, Zihong Chen, Xiaochun Wang


Citations: 0

Abstract

Hierarchical clustering is an important method in data mining. Its main disadvantages are its time complexity and its one-way irreversibility: once a merge or split has been made, it cannot be undone. Another major disadvantage is that the termination condition is imprecise. Hierarchical clustering requires the final number of clusters, yet for most datasets this number cannot be known in advance. To address these defects, this paper proposes a hierarchical clustering algorithm based on fast and uniform segmentation. The method combines split-based (divisive) and agglomeration-based (agglomerative) hierarchical clustering: it first partitions the original dataset quickly and uniformly, and then adaptively merges similar partitions based on partition density and partition distance.
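The abstract does not give the details of the partitioning or merging rules, so the following is only a minimal sketch of the general split-then-merge idea, not the authors' algorithm. It assumes a uniform grid as the "fast and uniform" partition, a simple point count per cell as the partition density, and the gap between grid indices as the partition distance. The function names and the parameters `cell_size`, `min_points`, and `max_gap` are all hypothetical.

```python
from collections import defaultdict


def grid_partition(points, cell_size):
    """Split step (assumed): bucket each point into a uniform grid cell."""
    cells = defaultdict(list)
    for p in points:
        # Floor-divide every coordinate to get the integer cell index.
        key = tuple(int(c // cell_size) for c in p)
        cells[key].append(p)
    return cells


def merge_partitions(cells, min_points, max_gap):
    """Merge step (assumed): join dense cells that lie close to each other.

    Density test: a cell must hold at least `min_points` points.
    Distance test: two cells merge when their grid indices differ by at
    most `max_gap` along every dimension.
    """
    dense = [k for k, pts in cells.items() if len(pts) >= min_points]
    parent = {k: k for k in dense}

    def find(k):
        # Union-find lookup with path halving.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for i, a in enumerate(dense):
        for b in dense[i + 1:]:
            if all(abs(x - y) <= max_gap for x, y in zip(a, b)):
                parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for k in dense:
        clusters[find(k)].extend(cells[k])
    return list(clusters.values())


def cluster(points, cell_size=1.0, min_points=2, max_gap=1):
    """Partition uniformly, then merge; the cluster count is not an input."""
    return merge_partitions(grid_partition(points, cell_size), min_points, max_gap)
```

Calling `cluster(points)` on a list of coordinate tuples returns a list of clusters, each a list of points. Points that fall in cells below the density threshold are left out as noise in this sketch, and the number of clusters emerges from the merge step rather than being supplied in advance, which is the motivation the abstract states.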
