Lutfi Sivana Ihzaniah, Adi Setiawan, R. W. N. Wijaya
Title: Perbandingan Kinerja Metode Regresi K-Nearest Neighbor dan Metode Regresi Linear Berganda pada Data Boston Housing [Performance Comparison of the K-Nearest Neighbor Regression Method and the Multiple Linear Regression Method on Boston Housing Data]
Journal: Jambura Journal of Probability and Statistics
Published: 2023-05-31
DOI: https://doi.org/10.34312/jjps.v4i1.18948
Abstract
This study compares the performance of K-Nearest Neighbor (KNN) regression and multiple linear regression on the Boston Housing data. Performance is measured by MAE, RMSE, MAPE, and R². KNN predicts the value for an object from its closest training examples, whereas multiple linear regression is a forecasting technique involving more than one independent variable. The two methods are compared on the basis of the Mean Absolute Percentage Error (MAPE). Two definitions of distance are used: Euclidean distance and Minkowski distance. The value K in the KNN method sets the number of nearest neighbors examined to determine the value of the dependent variable; K values from 1 to 10 are tried for each test-data percentage and each definition of distance. For both methods, the test data make up 20%, 30%, and 40% of the data. The best MAPE obtained by KNN regression was 12.89% at K = 3 for Euclidean distance and 13.22% at K = 3 for Minkowski distance, while the best MAPE for multiple linear regression was 17.17%. Because its MAPE is smaller, KNN regression is the better of the two methods.
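The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes unweighted averaging of the K nearest targets and treats the Minkowski order `p` as a free parameter, since the abstract does not state the value used. All function names and the toy data are hypothetical, and the data are not the Boston Housing set.

```python
import math


def minkowski(a, b, p=2):
    """Minkowski distance of order p; p=2 gives the Euclidean distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_X, train_y, query, k=3, p=2):
    """Average the targets of the k training points closest to `query`.

    Assumption: neighbors are weighted equally.
    """
    order = sorted(range(len(train_X)),
                   key=lambda i: minkowski(train_X[i], query, p))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - f) for t, f in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - f) ** 2 for t, f in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be nonzero)."""
    return 100.0 * sum(abs((t - f) / t)
                       for t, f in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - f) ** 2 for t, f in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: the target is 2 * x1 + x2.
train_X = [(1.0, 1.0), (2.0, 1.0), (3.0, 2.0),
           (4.0, 3.0), (5.0, 5.0), (6.0, 5.0)]
train_y = [2 * a + b for a, b in train_X]
test_X = [(2.5, 1.5), (4.5, 4.0)]
test_y = [2 * a + b for a, b in test_X]

# Scan a few K values and report the MAPE for each, as the study does for K = 1..10.
for k in (1, 2, 3):
    preds = [knn_predict(train_X, train_y, q, k=k, p=2) for q in test_X]
    print(k, round(mape(test_y, preds), 2))
```

Choosing K by the smallest test MAPE mirrors the comparison described in the abstract; the same loop could be repeated for other values of `p` and for different train/test proportions.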