Perbandingan Kinerja Metode Regresi K-Nearest Neighbor dan Metode Regresi Linear Berganda pada Data Boston Housing

Lutfi Sivana Ihzaniah, Adi Setiawan, R. W. N. Wijaya
Jambura Journal of Probability and Statistics, published 2023-05-31
DOI: 10.34312/jjps.v4i1.18948

Abstract

This research compares the performance of the KNN (K-Nearest Neighbor) regression method and the multiple linear regression method on the Boston Housing data. Performance is measured by MAE, RMSE, MAPE, and R². KNN predicts the value of an object from its closest training examples, while multiple linear regression is a forecasting technique involving more than one independent variable. The two methods are compared using the Mean Absolute Percentage Error (MAPE). The distance definitions used are the Euclidean distance and the Minkowski distance. The value K in the KNN method is the number of nearest neighbors examined to determine the value of the dependent variable; K values from 1 to 10 were used for each test-data proportion and distance definition. Test-data proportions of 20%, 30%, and 40% were used for both methods. The best MAPE obtained by the KNN regression method was 12.89% at K = 3 with the Euclidean distance and 13.22% at K = 3 with the Minkowski distance, while the best MAPE for the multiple linear regression method was 17.17%. The KNN regression method therefore performs better, since its MAPE is smaller than that of the multiple linear regression method.
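The comparison described above can be sketched with scikit-learn. This is a minimal illustration, not the authors' code: synthetic regression data stands in for the Boston Housing dataset (which has been removed from recent scikit-learn releases), and the Minkowski exponent p = 3 is an assumption, since the abstract does not state which p was used alongside the Euclidean case (p = 2).

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_absolute_percentage_error

# Synthetic data standing in for Boston Housing (506 samples, 13 features).
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)
y = y - y.min() + 1.0  # shift targets positive so MAPE is well defined

# One of the paper's splits: 20% test data.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.20, random_state=0)

# Multiple linear regression baseline.
lin = LinearRegression().fit(X_tr, y_tr)
mape_lin = mean_absolute_percentage_error(y_te, lin.predict(X_te))

# KNN regression, scanning K = 1..10 for Euclidean (p=2) and Minkowski (p=3).
results = {}
for p in (2, 3):
    for k in range(1, 11):
        knn = KNeighborsRegressor(n_neighbors=k, p=p).fit(X_tr, y_tr)
        results[(p, k)] = mean_absolute_percentage_error(y_te, knn.predict(X_te))

best_p, best_k = min(results, key=results.get)
print(f"linear regression MAPE = {mape_lin:.4f}")
print(f"best KNN: p={best_p}, K={best_k}, MAPE = {results[(best_p, best_k)]:.4f}")
```

Repeating the loop with `test_size=0.30` and `0.40` reproduces the paper's full grid of splits; the reported conclusion corresponds to the KNN MAPE being smaller than the linear-regression MAPE.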