Improved K-Means Algorithm and its Implementation Based on Mean Shift

2018 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI). Pub Date: 2018-10-01. DOI: 10.1109/CISP-BMEI.2018.8633100

Yang Chen, Pengfei Hu, Weilan Wang


Citations: 2


Abstract

The traditional K-means algorithm is sensitive to its initial cluster centers: different random choices of initial centers lead to different clustering results. This paper proposes an improved K-means algorithm based on Mean Shift clustering to address this problem. The algorithm uses Mean Shift to select a high-density migration vector set, MP, and then takes the k points in the high-density region of MP that are farthest from one another as the initial cluster centers. The algorithm is evaluated on the Iris and Wine data sets from the UCI repository, and on a real sample (called the Tibetan dataset) of 150 vowel images located above the baseline, taken from text analysis of Ujin-style Tibetan ancient books. The experimental results indicate that the algorithm achieves better clustering results, with higher accuracy and greater stability.
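The initialization idea in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the kernel, the bandwidth, the density threshold, or how the k mutually farthest points are found. The sketch assumes a flat kernel, treats shifted points with at least the average neighbor count as the high-density set MP, and uses a greedy farthest-first pass to pick the seeds. All names (`mean_shift_modes`, `pick_initial_centers`, `kmeans`, `bandwidth`) are hypothetical.

```python
import math


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def mean_shift_modes(data, bandwidth, iters=50, tol=1e-6):
    """Shift each point to the mean of its neighbors (flat kernel, assumed)
    until it moves less than tol or the iteration cap is reached."""
    modes = []
    for p in data:
        for _ in range(iters):
            neighbors = [q for q in data if math.dist(p, q) <= bandwidth]
            shifted = mean(neighbors)
            moved = math.dist(p, shifted)
            p = shifted
            if moved < tol:
                break
        modes.append(p)
    return modes


def pick_initial_centers(data, k, bandwidth):
    """Build the high-density set MP, then pick k mutually distant points."""
    modes = mean_shift_modes(data, bandwidth)
    counts = [sum(1 for q in data if math.dist(m, q) <= bandwidth)
              for m in modes]
    # Assumed cutoff: keep shifted points with at least average density.
    threshold = sum(counts) / len(counts)
    mp = [m for m, c in zip(modes, counts) if c >= threshold]
    # Greedy farthest-first selection (an assumption), seeded with the
    # densest point. If MP has fewer than k distinct points, seeds repeat.
    centers = [modes[counts.index(max(counts))]]
    while len(centers) < k:
        centers.append(
            max(mp, key=lambda m: min(math.dist(m, c) for c in centers)))
    return centers


def kmeans(data, centers, iters=100):
    """Standard Lloyd iterations starting from the given centers."""
    centers = list(centers)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in data
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # leave an empty cluster's center unchanged
                centers[j] = mean(members)
    return centers, labels


# Toy data: three well-separated groups of five points each.
offsets = [(0, 0), (0.5, 0), (0, 0.5), (-0.5, 0), (0, -0.5)]
data = [(cx + dx, cy + dy)
        for cx, cy in [(0, 0), (10, 0), (0, 10)]
        for dx, dy in offsets]
seeds = pick_initial_centers(data, k=3, bandwidth=2.0)
centers, labels = kmeans(data, seeds)
```

Because the seeds are derived deterministically from the data rather than drawn at random, repeated runs give the same clustering, which is the stability property the abstract emphasizes. The bandwidth still has to be chosen for each data set; the value 2.0 here only suits the toy data.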

