{"title":"A Comparison of Machine Learning Approaches to Housing Value Estimation","authors":"Orton Babb","doi":"10.1137/18s017296","DOIUrl":null,"url":null,"abstract":"Housing value estimation relies on hedonic pricing models whereby price is determined by both internal characteristics (bedrooms, bathrooms, living area, etc.) and external characteristics (neighboring houses, ZIP code, etc.). While classical parametric models based on linear regression analysis have been well studied in this application, the theory of hedonic prices places no restrictions on the hedonic price functional form, and hence, more recent research has attempted to apply machine learning (ML) approaches such as K-Nearest Neighbors and Support Vector Machine Regression (SVR). Many of these ML methods are employed on the basis of their flexibility in terms of making fewer assumptions about the shape or distribution of the data. ML models are therefore used with the expectation of higher accuracy in predicting the final sale price of a house. In this study, we consider the combination of various pre-processing procedures and candidate models on a historical data set of house sales in King County, Washington. Different measures of accuracy are considered in interpreting model performance. The results suggest that while machine learning algorithms like SVR achieve top performance as measured by the adjusted R², classical parametric models can also achieve out-of-sample generalization nearing that of the more sophisticated ML models, with faster training times, no need for feature scaling, and more easily interpreted parameters.","PeriodicalId":93373,"journal":{"name":"SIAM undergraduate research online","volume":"57 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"SIAM undergraduate research online","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1137/18s017296","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Comparison of Machine Learning Approaches to Housing Value Estimation
Housing value estimation relies on hedonic pricing models whereby price is determined by both internal characteristics (bedrooms, bathrooms, living area, etc.) and external characteristics (neighboring houses, ZIP code, etc.). While classical parametric models based on linear regression analysis have been well studied in this application, the theory of hedonic prices places no restrictions on the hedonic price functional form, and hence, more recent research has attempted to apply machine learning (ML) approaches such as K-Nearest Neighbors and Support Vector Machine Regression (SVR). Many of these ML methods are employed on the basis of their flexibility in terms of making fewer assumptions about the shape or distribution of the data. ML models are therefore used with the expectation of higher accuracy in predicting the final sale price of a house. In this study, we consider the combination of various pre-processing procedures and candidate models on a historical data set of house sales in King County, Washington. Different measures of accuracy are considered in interpreting model performance. The results suggest that while machine learning algorithms like SVR achieve top performance as measured by the adjusted R², classical parametric models can also achieve out-of-sample generalization nearing that of the more sophisticated ML models, with faster training times, no need for feature scaling, and more easily interpreted parameters.
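
The abstract names the adjusted R² as one of its accuracy measures but does not state how it is computed. As a minimal sketch, assuming the standard textbook definition (the paper's exact formula is not given here), the measure rescales R² by the number of observations n and predictors p, so that adding uninformative features lowers the score. The function names and the sample prices below are hypothetical and chosen only for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(y_true, y_pred, n_features):
    """Standard adjusted R^2: penalizes R^2 for the number of predictors."""
    n = len(y_true)
    if n - n_features - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    r2 = r_squared(y_true, y_pred)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)


# Made-up sale prices (in $1000s) and model predictions, not from the paper.
prices = [300.0, 450.0, 520.0, 610.0, 700.0, 820.0]
predicted = [320.0, 430.0, 540.0, 600.0, 690.0, 830.0]

print("R^2:         ", r_squared(prices, predicted))
print("adjusted R^2:", adjusted_r_squared(prices, predicted, n_features=2))
```

Because the penalty grows with the number of predictors, two models with the same raw R² can rank differently under this measure, which is one reason a simpler parametric model may compare well against a more flexible one.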