{"title":"Predictability of Belgian residential real estate rents using tree-based ML models and IML techniques","authors":"Ian Lenaers, Kris Boudt, Lieven De Moor","doi":"10.1108/ijhma-11-2022-0172","DOIUrl":null,"url":null,"abstract":"\nPurpose\nThe purpose is twofold. First, this study aims to establish that black box tree-based machine learning (ML) models have better predictive performance than a standard linear regression (LR) hedonic model for rent prediction. Second, it shows the added value of analyzing tree-based ML models with interpretable machine learning (IML) techniques.\n\n\nDesign/methodology/approach\nData on Belgian residential rental properties were collected. Tree-based ML models, random forest regression and eXtreme gradient boosting regression were applied to derive rent prediction models to compare predictive performance with a LR model. Interpretations of the tree-based models regarding important factors in predicting rent were made using SHapley Additive exPlanations (SHAP) feature importance (FI) plots and SHAP summary plots.\n\n\nFindings\nResults indicate that tree-based models perform better than a LR model for Belgian residential rent prediction. The SHAP FI plots agree that asking price, cadastral income, surface livable, number of bedrooms, number of bathrooms and variables measuring the proximity to points of interest are dominant predictors. The direction of relationships between rent and its factors is determined with SHAP summary plots. In addition to linear relationships, it emerges that nonlinear relationships exist.\n\n\nOriginality/value\nRent prediction using ML is relatively less studied than house price prediction. In addition, studying prediction models using IML techniques is relatively new in real estate economics. Moreover, to the best of the authors’ knowledge, this study is the first to derive insights of driving determinants of predicted rents from SHAP FI and SHAP summary plots.\n","PeriodicalId":14136,"journal":{"name":"International Journal of Housing Markets and Analysis","volume":" ","pages":""},"PeriodicalIF":1.5000,"publicationDate":"2023-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Housing Markets and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1108/ijhma-11-2022-0172","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"URBAN STUDIES","Score":null,"Total":0}
引用次数: 1
Abstract
Purpose
The purpose is twofold. First, this study aims to establish that black box tree-based machine learning (ML) models have better predictive performance than a standard linear regression (LR) hedonic model for rent prediction. Second, it shows the added value of analyzing tree-based ML models with interpretable machine learning (IML) techniques.
Design/methodology/approach
Data on Belgian residential rental properties were collected. Tree-based ML models, random forest regression and eXtreme gradient boosting regression were applied to derive rent prediction models to compare predictive performance with a LR model. Interpretations of the tree-based models regarding important factors in predicting rent were made using SHapley Additive exPlanations (SHAP) feature importance (FI) plots and SHAP summary plots.
Findings
Results indicate that tree-based models perform better than a LR model for Belgian residential rent prediction. The SHAP FI plots agree that asking price, cadastral income, surface livable, number of bedrooms, number of bathrooms and variables measuring the proximity to points of interest are dominant predictors. The direction of relationships between rent and its factors is determined with SHAP summary plots. In addition to linear relationships, it emerges that nonlinear relationships exist.
Originality/value
Rent prediction using ML is relatively less studied than house price prediction. In addition, studying prediction models using IML techniques is relatively new in real estate economics. Moreover, to the best of the authors’ knowledge, this study is the first to derive insights of driving determinants of predicted rents from SHAP FI and SHAP summary plots.