The Application of Big Data Mining Prediction Based on Improved K-Means Algorithm

2019 34rd Youth Academic Annual Conference of Chinese Association of Automation (YAC) | Pub Date: 2019-06-06 | DOI: 10.1109/YAC.2019.8787670

Yuchen Qiao, Yunlu Li, Xiaotian Lv

Pages: 348-351

Citations: 3

Abstract



To address the low efficiency of the K-Means algorithm on data mining prediction problems over big data with many attributes, this paper proposes a method for predicting residents' annual income based on an improved K-Means algorithm. The improved algorithm combines principal component analysis (PCA) with the traditional K-Means algorithm: the data attributes are first reduced in dimensionality, and the reduced data are then clustered with K-Means. The study uses the 1994 U.S. census database to compare the two algorithms. Prediction accuracy rises from 53.1016% to 66.4329%, a gain of 13.3313 percentage points. The results indicate that the improved algorithm effectively raises the accuracy of both clustering and annual income prediction.
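The pipeline the abstract describes (reduce dimensionality with PCA, then cluster with K-Means and score the clusters against known labels) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic rather than the 1994 U.S. census records, and the number of retained components (2), the number of clusters (2), the majority-label scoring rule, and all function names are hypothetical choices made for the example.

```python
import random

def standardize(rows):
    """Rescale every column to zero mean and unit variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [(sum((x - m) ** 2 for x in c) / n) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def principal_axes(rows, n_components, rng, iters=200):
    """Leading covariance eigenvectors via power iteration with deflation.
    Assumes the rows are already centered (e.g. by standardize)."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]
    axes = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        eigval = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        axes.append(v)
        # Deflate: subtract the component just found before seeking the next.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)] for i in range(d)]
    return axes

def project(rows, axes):
    """Coordinates of each row along the given principal axes."""
    return [[sum(x * a for x, a in zip(r, axis)) for axis in axes] for r in rows]

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, rng, restarts=5, iters=100):
    """Lloyd's algorithm; keeps the restart with the lowest within-cluster error."""
    best_labels, best_cost = None, float("inf")
    for _ in range(restarts):
        centers = rng.sample(points, k)
        for _ in range(iters):
            labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
            updated = []
            for c in range(k):
                members = [p for p, l in zip(points, labels) if l == c]
                updated.append([sum(col) / len(members) for col in zip(*members)]
                               if members else centers[c])
            if updated == centers:
                break
            centers = updated
        cost = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels

def cluster_accuracy(labels, truth, k):
    """Score each cluster by its majority true label (an assumed scoring rule)."""
    correct = 0
    for c in range(k):
        members = [t for l, t in zip(labels, truth) if l == c]
        if members:
            correct += max(members.count(v) for v in set(members))
    return correct / len(truth)

def make_synthetic(rng, n_per_class=200):
    """Hypothetical stand-in data: 3 class-dependent attributes + 3 noise attributes."""
    rows, truth = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = [rng.gauss(4.0 * label, 1.0) for _ in range(3)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
            rows.append(informative + noise)
            truth.append(label)
    return rows, truth

rng = random.Random(0)
rows, truth = make_synthetic(rng)
scaled = standardize(rows)
reduced = project(scaled, principal_axes(scaled, 2, rng))  # 6 attributes -> 2
labels = kmeans(reduced, 2, rng)
accuracy = cluster_accuracy(labels, truth, 2)
print(f"accuracy after PCA + K-Means: {accuracy:.3f}")
```

The synthetic classes here are deliberately well separated, so the accuracy printed by this sketch says nothing about the figures reported in the abstract; reproducing those would require the census data and the authors' actual preprocessing and evaluation choices.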

