A New Clustering Algorithm by Using Boundary Information

Authors: Junkun Zhong, Yuping Wang, Hui Du, Wuning Tong
Venue: 2018 IEEE Congress on Evolutionary Computation (CEC)
Published: 2018-07-01
DOI: 10.1109/CEC.2018.8477697 (https://doi.org/10.1109/CEC.2018.8477697)

Many clustering algorithms, such as K-means, are unsuitable for non-convex datasets, and the Affinity Propagation (AP) algorithm may merge adjacent points of two different classes into one class. To address these shortcomings, we propose a new clustering algorithm that uses boundary information. The algorithm works as follows. First, the density of each point is defined as the number of points contained in its neighborhood, and points whose density is below the average density are treated as boundary points. Next, the boundary points are counted. If their number exceeds a given threshold, clustering is carried out directly by transitive propagation (the transfer idea); otherwise, the boundary points are treated as a wall between clusters. In the latter case, whenever the transitive clustering process encounters a boundary point, the transfer stops there, and an unprocessed non-boundary point is selected to start the same process again. This repeats until all non-boundary points have been processed, which effectively prevents adjacent points of different classes from being clustered into one class. Because clustering proceeds by transfer, the proposed algorithm is applicable to non-convex datasets, and choosing the clustering scheme according to the number of boundary points broadens its applicability. Experimental results on synthetic and standard datasets show that the proposed algorithm is efficient.
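
The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: the neighborhood is taken to be a fixed-radius ball (`eps`), a point is excluded from its own neighborhood, the threshold is expressed as a fraction of the dataset (`max_boundary_fraction`), and boundary points that act as walls are simply left unlabeled (`-1`). The function name `boundary_clustering` is hypothetical.

```python
from collections import deque
from math import dist


def boundary_clustering(points, eps, max_boundary_fraction=0.5):
    """Sketch of density-based boundary detection plus transitive clustering.

    Assumptions (not stated in the abstract): fixed-radius neighborhoods,
    a fractional threshold, and wall points left with label -1.
    """
    n = len(points)
    if n == 0:
        return [], []

    # Neighborhood of each point: all other points within distance eps.
    neighbors = [
        [j for j in range(n) if j != i and dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]

    # Density = neighborhood size; below-average density marks a boundary point.
    density = [len(nb) for nb in neighbors]
    mean_density = sum(density) / n
    is_boundary = [d < mean_density for d in density]

    # More boundary points than the threshold: cluster by transfer directly.
    # Otherwise the boundary points act as walls that stop the transfer.
    use_walls = sum(is_boundary) <= max_boundary_fraction * n
    blocked = is_boundary if use_walls else [False] * n

    labels = [-1] * n
    cluster = 0
    for seed in range(n):
        if blocked[seed] or labels[seed] != -1:
            continue
        # Start a new cluster and spread the label through neighborhoods.
        labels[seed] = cluster
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            for j in neighbors[i]:
                if labels[j] == -1 and not blocked[j]:
                    labels[j] = cluster
                    queue.append(j)
        cluster += 1
    return labels, is_boundary


# Two dense groups on a line, joined by one sparse bridge point at x = 6.
points = [(x,) for x in (0, 1, 2, 3, 4, 6, 8, 9, 10, 11, 12)]
labels, is_boundary = boundary_clustering(points, eps=2.5)
```

In this toy dataset the bridge point has below-average density, so in wall mode it stops the transfer and the two groups receive different labels. Setting `max_boundary_fraction=0.0` forces the direct transfer scheme, in which the bridge links both groups into a single cluster. How wall points are finally assigned to clusters is not stated in the abstract, so the sketch leaves them unlabeled.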