Research and Demonstration of a Prediction Algorithm Based on GBDT-LightGBM Algorithm

Houzhi Chen, Zichun Liu, Minyan Dai
Published in: 2023 Asia-Europe Conference on Electronics, Data Processing and Informatics (ACEDPI)
Publication date: 2023-04-01
DOI: 10.1109/ACEDPI58926.2023.00049 (https://doi.org/10.1109/ACEDPI58926.2023.00049)

Abstract

This paper uses the LightGBM algorithm to predict housing prices in Chinese municipalities. Following previous research, the factors influencing housing prices are grouped into three main aspects (demand, supply, and regulation policy), which are then analyzed and used for prediction. In the case analysis, the coefficient of determination (R-squared) and the mean absolute percentage error (MAPE) are used to test the accuracy of the model, and the Kendall tau-b (K) coefficient is used for correlation analysis and consistency testing. After redundant indicators are eliminated, the Kendall coefficient of concordance W of the model is 0.977. Once the appropriate influencing factors and data are selected, the Light Gradient Boosting Machine (LightGBM) is trained in SPSS-PRO software with gradient boosting decision trees (GBDT) as the base learners. The model reaches its highest accuracy when the number of base learners is 500. The training results show that the demand and policy aspects have the largest influence and the supply aspect has a small one; the degrees of influence of the three are 47%, 43%, and 10%, respectively. Among the secondary indicators, main business taxes and surcharges, the urbanization rate, and the loan amount of real estate development enterprises have the greatest impact. The R-squared values on the training set and the test set are 0.905 and 0.902, respectively. The model is therefore accurate and provides an effective reference for housing price prediction.
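The three statistics named in the abstract (R-squared, MAPE, and the Kendall coefficient of concordance W) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the paper reports using SPSS-PRO, not this code, and the function names, the input layout, and the sample numbers are assumptions made here. The W formula shown is the standard version without tie correction, so it assumes each rater's ranking has no ties.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def kendalls_w(ranks: Sequence[Sequence[float]]) -> float:
    """Kendall's W for m rankings of the same n items (no tie correction).

    ranks[j][i] is the rank that rater j assigns to item i.
    """
    m = len(ranks)            # number of raters (rankings)
    n = len(ranks[0])         # number of ranked items
    rank_sums = [sum(r[i] for r in ranks) for i in range(n)]
    mean_sum = sum(rank_sums) / n
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))


if __name__ == "__main__":
    # Hypothetical prices and predictions, not data from the paper.
    actual = [100.0, 200.0, 300.0, 400.0]
    predicted = [110.0, 190.0, 310.0, 390.0]
    print("R-squared:", r_squared(actual, predicted))
    print("MAPE (%):", mape(actual, predicted))
    # Hypothetical rankings of four indicators by three raters.
    print("Kendall's W:", kendalls_w([[1, 2, 3, 4], [1, 3, 2, 4], [2, 1, 3, 4]]))
```

W ranges from 0 (no agreement among the rankings) to 1 (identical rankings), which is why a value such as 0.977 is read as strong concordance.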