{"title":"Improved KNN Algorithm based on Probability and Adaptive K Value","authors":"Yulong Ling, Xiao Zhang, Yong Zhang","doi":"10.1145/3456172.3456201","DOIUrl":null,"url":null,"abstract":"As one of the most classical supervised learning algorithms, the KNN algorithm is not only easy to understand but also can solve classification problems very well. Nevertheless, the KNN algorithm has a serious drawback:The voting principle used to predict the category of samples to be classified is too simple and does not take into account the proximity of the number of samples contained in each category in k near-neighbor samples. To solve this problem, this paper proposes a novel decision strategy based on probability and iterative k value to improve the KNN algorithm. By constantly adjusting the value of k to bring the probability value of the largest class in the k neighborhood to the specified threshold, the decision is sufficiently persuasive. The experimental results on several UCI public data sets show that compared with the KNN algorithm and the distance-weighted KNN algorithm, the improved algorithm in this paper improves the classification accuracy while reducing the sensitivity to hyperparameter k to a certain extent.","PeriodicalId":133908,"journal":{"name":"Proceedings of the 2021 7th International Conference on Computing and Data Engineering","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2021 7th International Conference on Computing and Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3456172.3456201","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
As one of the most classical supervised learning algorithms, KNN is not only easy to understand but also solves classification problems very well. Nevertheless, the KNN algorithm has a serious drawback: the voting rule used to predict the category of a sample to be classified is too simple, ignoring how close the per-category counts among the k nearest neighbors are to one another. To solve this problem, this paper proposes a novel decision strategy based on probability and an iterative k value to improve the KNN algorithm. By repeatedly adjusting k until the probability of the largest class in the k-neighborhood reaches a specified threshold, the decision becomes sufficiently persuasive. Experimental results on several UCI public data sets show that, compared with the standard KNN algorithm and the distance-weighted KNN algorithm, the improved algorithm improves classification accuracy while also reducing sensitivity to the hyperparameter k to a certain extent.
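The core idea of the abstract, growing k until the majority class among the k nearest neighbors accounts for at least a given probability threshold, can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the function name, the starting value `k0`, the cap `k_max`, and the Euclidean metric are all assumptions made for the example.

```python
from collections import Counter
import math

def adaptive_knn_predict(train_X, train_y, query, k0=3, k_max=15, threshold=0.7):
    """Hypothetical sketch of probability-threshold adaptive-k voting.

    Starting from k0, grow k until the most frequent class among the k
    nearest neighbors accounts for at least `threshold` of them (or until
    k_max / the data set is exhausted), then return that class.
    """
    # Sort training points by Euclidean distance to the query once;
    # growing k then just extends the prefix of this sorted list.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    k = k0
    while True:
        labels = [y for _, y in dists[:k]]
        label, count = Counter(labels).most_common(1)[0]
        # Accept the vote once the largest class's share meets the
        # threshold, or when k can grow no further.
        if count / k >= threshold or k >= min(k_max, len(dists)):
            return label
        k += 1

# Toy usage: two well-separated clusters.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]
print(adaptive_knn_predict(X, y, (0.5, 0.5)))  # → a
```

Because the neighbor list is sorted once, each increment of k only re-counts labels over a slightly longer prefix; the threshold check is what distinguishes this from plain majority voting with a fixed k.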