Disaggregating census data for population mapping using a Bayesian Additive Regression Tree model

Ortis Yankey, Chigozie E. Utazi, Christopher C. Nnanatu, Assane N. Gadiaga, Thomas Abbot, Attila N. Lazar, Andrew J. Tatem
Journal: Applied Geography, volume 172, Article 103416
Published: 2024-09-14 (journal article; not open access)
DOI: 10.1016/j.apgeog.2024.103416
Full text: https://www.sciencedirect.com/science/article/pii/S0143622824002212
Journal metrics: impact factor 4.0; CAS Zone 2 (Earth Sciences); JCR Q1 (Geography)
Citations: 0

Abstract

Population data is crucial for policy decisions, but fine-scale population numbers are often lacking due to the challenge of sharing sensitive data. Different approaches, such as the use of the Random Forest (RF) model, have been used to disaggregate census data from higher administrative units to small area scales. A major limitation of the RF model is its inability to quantify the uncertainties associated with the predicted populations, which can be important for policy decisions. In this study, we applied a Bayesian Additive Regression Tree (BART) model for population disaggregation and compared the results with those of an RF model using both simulated data and the 2021 census data for Ghana. The BART model consistently outperforms the RF model in out-of-sample predictions for all metrics, such as bias, mean squared error (MSE), and root mean squared error (RMSE). The BART model also addresses the limitations of the RF model by providing uncertainty estimates around the predicted population, which the RF model often lacks. Overall, the study demonstrates the superiority of the BART model over the RF model in disaggregating population data and highlights its potential for gridded population estimates.
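As a rough illustration of the evaluation metrics named in the abstract, the sketch below computes bias, MSE, and RMSE for a set of out-of-sample predictions. It also shows one common way to split an administrative-unit total across grid cells in proportion to model-predicted weights. The function names, the weighting scheme, the sign convention for bias (predicted minus observed), and all numbers are assumptions made for illustration; none of them is taken from the paper.

```python
import math


def prediction_metrics(observed, predicted):
    """Return bias, MSE and RMSE for paired observed/predicted counts."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    bias = sum(errors) / n  # mean signed error (assumed: predicted - observed)
    mse = sum(e * e for e in errors) / n
    return {"bias": bias, "mse": mse, "rmse": math.sqrt(mse)}


def disaggregate(unit_total, cell_weights):
    """Split one unit total across cells in proportion to non-negative weights."""
    weight_sum = sum(cell_weights)
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    return [unit_total * w / weight_sum for w in cell_weights]


# Hypothetical values, for illustration only (not from the paper).
cells = disaggregate(1000, [1.0, 3.0, 6.0])          # weights from any fitted model
metrics = prediction_metrics([90, 310, 600], cells)  # compare with held-out counts
print(cells, metrics)
```

With a Bayesian model such as BART, the same proportional split could in principle be applied to each posterior draw of the weights, which would yield a distribution of cell-level counts rather than a single value; whether the authors proceed this way is not stated in the abstract.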

Source journal: Applied Geography (category: Geography)
CiteScore: 8.00
Self-citation rate: 2.00%
Articles published: 134
About the journal: Applied Geography is a journal devoted to the publication of research which utilizes geographic approaches (human, physical, nature-society and GIScience) to resolve human problems that have a spatial dimension. These problems may be related to the assessment, management and allocation of the world's physical and/or human resources. The underlying rationale of the journal is that only through a clear understanding of the relevant societal, physical, and coupled natural-human systems can we resolve such problems. Papers are invited on any theme involving the application of geographical theory and methodology in the resolution of human problems.