Research on K Nearest Neighbor Non-parametric Regression Algorithm Based on KD-Tree and Clustering Analysis
Authors: Zheng-Wu Yuan, Yuan-Hui Wang
Published in: 2012 Fourth International Conference on Computational and Information Sciences
Publication date: 2012-08-17
DOI: 10.1109/ICCIS.2012.246 (https://doi.org/10.1109/ICCIS.2012.246)
Citation count: 10
Abstract
To address the limitations of existing K-nearest-neighbor non-parametric regression methods, this paper uses spatial autocorrelation analysis to determine the state vector. To speed up data search, it clips the samples to reduce data storage and uses a KD-Tree to split the samples quickly. The pruning principle of the KD-Tree further reduces the amount of nearest-neighbor search; a subset is then obtained by proportional sampling within the KD-Tree subsets, and K-Means clustering is run multiple times. The optimal K value is then selected, which reduces the forecast error caused by using a uniform K value in traditional non-parametric regression. The experimental results show that the improved forecasting method is superior to the traditional method.
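The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of K-nearest-neighbor regression accelerated by a KD-Tree with branch pruning. It is not the authors' implementation: the function names (build_kdtree, knn_search, knn_predict, brute_force_predict), the Euclidean distance, the unweighted mean of neighbor targets, and the fixed K are all assumptions made for illustration. Sample clipping, proportional sampling, and the K-Means-based selection of K described above are not reproduced.

```python
import heapq
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0):
    """Build a KD-Tree from a list of (state_vector, target) pairs.

    Each level splits on one coordinate axis at the median sample.
    """
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def knn_search(node, query, k, heap):
    """Collect the k nearest samples in `heap` as (-squared_distance, target).

    Storing negated distances turns Python's min-heap into a max-heap,
    so heap[0] is always the current farthest of the k candidates.
    """
    if node is None:
        return
    vec, target = node["point"]
    d = sq_dist(vec, query)
    if len(heap) < k:
        heapq.heappush(heap, (-d, target))
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, target))

    axis = node["axis"]
    diff = query[axis] - vec[axis]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    knn_search(near, query, k, heap)
    # Pruning: the far subtree can only hold a closer sample if the
    # splitting plane is nearer than the current k-th neighbor.
    if len(heap) < k or diff * diff < -heap[0][0]:
        knn_search(far, query, k, heap)


def knn_predict(tree, query, k):
    """Forecast as the unweighted mean target of the k nearest samples."""
    heap = []
    knn_search(tree, query, k, heap)
    return sum(t for _, t in heap) / len(heap)


def brute_force_predict(data, query, k):
    """Reference forecast using a full scan (assumes len(data) >= k)."""
    nearest = sorted(data, key=lambda p: sq_dist(p[0], query))[:k]
    return sum(t for _, t in nearest) / k


if __name__ == "__main__":
    # Synthetic data only; not the data set used in the paper.
    random.seed(0)
    samples = [((random.random(), random.random()), random.random())
               for _ in range(500)]
    tree = build_kdtree(samples)
    q = (0.5, 0.5)
    print(knn_predict(tree, q, k=5), brute_force_predict(samples, q, k=5))
```

The pruned tree search and the full scan should agree on the forecast (up to floating-point rounding) while the tree search visits fewer samples. Replacing the fixed k with a value chosen per cluster would be the natural place to plug in a K-selection step such as the one the abstract describes.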