Application of Random Forest to Identify for Poor Households in West Sumatera Province

Febri Ramayanti, Dodi Vionanda, Dony Permana, Zilrahmi
Journal: UNP Journal of Statistics and Data Science
Published: 2023-03-08
DOI: 10.24036/ujsds/vol1-iss2/31 (https://doi.org/10.24036/ujsds/vol1-iss2/31)

Abstract

Poverty is a socioeconomic problem in Indonesia. The number of people living in poverty in West Sumatera increased by 26.44 thousand from 2020 to 2021. The government has created programs to address poverty that take into account criteria for poor households. These criteria were developed using data from the National Socioeconomic Survey (Susenas). However, instead of showing the actual location of poor households, the existing data only give the number of poor households, which makes the programs less effective. This can be addressed with classification analysis using random forest (RF). RF is a collection of many decision trees. Before fitting an RF, one has to determine the values of three tuning parameters: mtry, ntree, and node size. The results are the smallest out-of-bag (OOB) error rate (%) and the Variable Importance Measure (VIM). In this research, classification by RF gave an OOB error rate of 5.65%, or an accuracy rate of 94.35%, with tuning parameters mtry=5 and ntree=500. Based on the VIM, the criteria for poor households include sources of drinking water such as protected or unprotected spring water and surface water; lighting sources such as non-PLN electricity or no electricity; cooking fuel such as charcoal and firewood; and a head of household who is self-employed, a family worker, or an unpaid worker with at least a junior high degree.
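
The role of the three tuning parameters and of the OOB error rate can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation and does not use the Susenas data: the four binary features, the labelling rule, and all function names (`gini`, `grow_tree`, `predict`, `random_forest_oob`) are assumptions made for illustration only. Each tree is grown on a bootstrap sample, tries `mtry` randomly chosen features at every split, and stops splitting when a child would fall below `min_node` rows; every household is then scored only by the trees that did not see it, which gives the OOB error. The VIM is not computed here.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def grow_tree(X, y, mtry, min_node, rng):
    """Grow one tree on binary (0/1) features, trying mtry random features per split."""
    majority = Counter(y).most_common(1)[0][0]
    if len(y) < 2 * min_node or gini(y) == 0.0:
        return majority  # leaf: predict the majority class
    best = None
    for j in rng.sample(range(len(X[0])), mtry):
        left = [i for i, row in enumerate(X) if row[j] == 0]
        right = [i for i, row in enumerate(X) if row[j] == 1]
        if len(left) < min_node or len(right) < min_node:
            continue  # split would create a node that is too small
        score = (len(left) * gini([y[i] for i in left])
                 + len(right) * gini([y[i] for i in right])) / len(y)
        if best is None or score < best[0]:
            best = (score, j, left, right)
    if best is None:
        return majority
    _, j, left, right = best
    return (j,
            grow_tree([X[i] for i in left], [y[i] for i in left], mtry, min_node, rng),
            grow_tree([X[i] for i in right], [y[i] for i in right], mtry, min_node, rng))


def predict(tree, row):
    """Walk from the root to a leaf; internal nodes are (feature, left, right) tuples."""
    while isinstance(tree, tuple):
        j, left, right = tree
        tree = right if row[j] == 1 else left
    return tree


def random_forest_oob(X, y, ntree=50, mtry=2, min_node=1, seed=0):
    """Fit ntree bootstrap trees and return (forest, OOB error rate)."""
    rng = random.Random(seed)
    n = len(X)
    votes = [Counter() for _ in range(n)]
    forest = []
    for _ in range(ntree):
        boot = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        tree = grow_tree([X[i] for i in boot], [y[i] for i in boot], mtry, min_node, rng)
        forest.append(tree)
        in_bag = set(boot)
        for i in range(n):
            if i not in in_bag:  # only out-of-bag rows vote with this tree
                votes[i][predict(tree, X[i])] += 1
    scored = [i for i in range(n) if votes[i]]
    errors = sum(votes[i].most_common(1)[0][0] != y[i] for i in scored)
    return forest, errors / len(scored)


# Synthetic data (assumption): 200 households, 4 binary indicators,
# labelled 1 when the first two indicators are both present.
data_rng = random.Random(42)
X = [[data_rng.randint(0, 1) for _ in range(4)] for _ in range(200)]
y = [1 if row[0] == 1 and row[1] == 1 else 0 for row in X]

forest, oob_error = random_forest_oob(X, y, ntree=50, mtry=2, min_node=1)
print(f"OOB error rate: {100 * oob_error:.2f}%")
print(f"OOB accuracy:   {100 * (1 - oob_error):.2f}%")
```

The last two lines mirror the relation used in the abstract: accuracy is 100% minus the OOB error rate, so an OOB error of 5.65% corresponds to 94.35% accuracy. In practice one would repeat the fit over a grid of mtry, ntree, and node size values and keep the combination with the smallest OOB error.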