Applying Catboost Regression Model for Prediction of House Prices
Authors: Rafea. M. Almejrb, O. Sallabi, A. Mohamed
Published in: 2022 International Conference on Engineering & MIS (ICEMIS), 2022-07-04
DOI: 10.1109/ICEMIS56295.2022.9914345 (https://doi.org/10.1109/ICEMIS56295.2022.9914345)
Citations: 0
Abstract
House prices rise every year, which motivates methods for forecasting future home prices. This paper applies the Catboost regression model while varying the number of iterations and the learning rate. The data are a set of house features and house values for King County, Washington. During pre-processing, extreme values are winsorized and highly correlated features are removed. Twelve candidate models are compared, including Catboost, RandomForestRegressor, and KNeighborsRegressor. They are assessed with several metrics: Mean Squared Error (MSE), R-squared (R2) score, Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). The model achieves a low RMSE of 0.013166 and performs well compared to other research, especially on the test set, where R2 is 0.915256. In this study, Catboost performs best overall and can be used to estimate home prices. The most significant factors affecting property prices are location, living area, and house condition. Comparison with other works confirms that these conclusions are consistent with real-world experience.
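As a rough illustration of two steps named in the abstract, the sketch below winsorizes a list of values and computes the four reported metrics (MSE, RMSE, MAE, R2) in plain Python. It is not the authors' code: the function names `winsorize` and `regression_metrics`, the 5th/95th percentile defaults, and the index-based percentile rule are all assumptions made for this example, since the abstract does not state the thresholds used.

```python
import math


def winsorize(values, lower_pct=0.05, upper_pct=0.95):
    """Clip values to the bounds found at the given percentiles.

    The percentile bounds are assumed defaults, not taken from the paper.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Simple index-based percentile lookup (an illustrative choice).
    low = ordered[int(lower_pct * (n - 1))]
    high = ordered[int(upper_pct * (n - 1))]
    return [min(max(v, low), high) for v in values]


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    # R2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


if __name__ == "__main__":
    # Made-up numbers, used only to exercise the functions.
    prices = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    print(winsorize(prices, 0.1, 0.9))
    print(regression_metrics([1, 2, 3, 4], [1, 2, 3, 6]))
```

RMSE is the square root of MSE, so the two always rank models identically. R2 adds a scale-free view by comparing the residual error with the variance of the target.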