Comparison of machine learning predictions of subjective poverty in rural China

Impact factor 4.4 · Zone 2, Economics · JCR Q1, Agricultural Economics & Policy
Lucie Maruejols, Hanjie Wang, Qiran Zhao, Yunli Bai, Linxiu Zhang
Published 2022-09-09 in China Agricultural Economic Review. DOI: 10.1108/caer-03-2022-0051 (https://doi.org/10.1108/caer-03-2022-0051)
Citations: 3

Abstract

Purpose: Despite rising incomes and the reduction of extreme poverty, the feeling of being poor remains widespread. Support programs can improve well-being, but they first require identifying which households judge their income insufficient to meet their basic needs, and what factors are associated with subjective poverty.

Design/methodology/approach: First, households report the income level they judge sufficient to make ends meet. They are then classified as subjectively poor if their own monetary income is below the level they indicated. Second, the study compares the performance of three machine learning algorithms (random forest, support vector machines, and least absolute shrinkage and selection operator (LASSO) regression) applied to a set of socioeconomic variables to predict subjective poverty status.

Findings: The random forest generates 85.29% correct predictions using a range of income and non-income predictors, closely followed by the other two techniques. For the middle-income group, the LASSO regression outperforms the random forest. For low-income households, subjective poverty is mostly associated with monetary income. For middle-income households, however, the key predictors of feeling poor are a combination of low income, low endowment (land, consumption assets) and unusually large expenditures (medical, gifts).

Practical implications: To reduce the feeling of poverty, policy intervention should continue to focus on increasing incomes. However, improvements in non-income domains such as health expenditure, education and family demographics can also relieve the feeling of income inadequacy. Methodologically, which algorithm performs better depends on the data at hand.

Originality/value: For the first time, the authors show that prediction techniques can reliably identify subjective poverty prevalence, using an example from rural China. The analysis pays specific attention to modest-income households, who may feel poor but not be identified as such by objective poverty lines, and is relevant when policy-makers seek to address the “next step” after ending extreme poverty. Prediction performance and mechanisms for three machine learning algorithms are compared.
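The labeling rule and the "share of correct predictions" metric described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' code: the `Household` class, the function names, the household values and the predictions are all hypothetical, and the strict inequality (equal income counts as not poor) is an assumption based on the abstract's wording.

```python
from dataclasses import dataclass


@dataclass
class Household:
    income: float             # household's own monetary income
    sufficient_income: float  # income the household reports as enough to make ends meet


def is_subjectively_poor(h: Household) -> bool:
    """Label rule: poor if own income is below the self-reported sufficient level."""
    return h.income < h.sufficient_income


def accuracy(predicted: list[bool], actual: list[bool]) -> float:
    """Share of correct predictions, in percent."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Made-up households, for illustration only (not data from the study).
households = [
    Household(income=8000, sufficient_income=12000),
    Household(income=30000, sufficient_income=20000),
    Household(income=15000, sufficient_income=15000),
    Household(income=9000, sufficient_income=10000),
]
labels = [is_subjectively_poor(h) for h in households]

# Hypothetical classifier output, standing in for a fitted model's predictions.
predictions = [True, False, True, True]
print(labels, accuracy(predictions, labels))
```

In the study, the predictions would come from a fitted random forest, support vector machine or LASSO model rather than a hand-written list; the metric itself is computed the same way.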
Source journal: China Agricultural Economic Review
Category: Agricultural Economics & Policy
CiteScore: 9.80
Self-citation rate: 5.90%
Articles published: 41
Review time: >12 weeks
Journal description: Published in association with China Agricultural University and the Chinese Association for Agricultural Economics, China Agricultural Economic Review publishes academic writings by international scholars. It particularly encourages empirical work that can be replicated and extended by others, and research articles that employ econometric and statistical hypothesis testing, optimization and simulation models. The journal aims to publish research that can be applied to China’s agricultural and rural policy-making process, to the development of the agricultural economics discipline, and to developing countries hoping to learn from China’s agricultural and rural development.