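The SMOTE Application of CART Methods for Coping Imbalanced Data in Classifying Status Work on Labor Force in the City of Padang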

A. Yulianti, F. Fitri, N. Amalita, Dodi Vionanda
UNP Journal of Statistics and Data Science, published 2023-05-31
DOI: 10.24036/ujsds/vol1-iss3/12 (https://doi.org/10.24036/ujsds/vol1-iss3/12)

Abstract

Employment is a major concern in every country, especially in developing countries such as Indonesia. The employment problems Indonesia faces are a lack of job opportunities, excess labor, and an uneven distribution of labor. The labor force is growing faster than the number of available jobs, so many workers cannot find work, which leads to unemployment. The city of Padang had the highest unemployment rate in West Sumatra from 2013 to 2021. Developing a smart city and identifying the factors that influence unemployment are among the efforts to reduce it. This study uses the CART (classification and regression trees) method to determine the factors that affect the work status of the labor force in the city of Padang. The advantage of CART is that its results are easy to interpret, but the accuracy of the classification tree is low when the data are imbalanced. This study therefore applies SMOTE (synthetic minority oversampling technique) to address the imbalance. The optimal classification tree has 8 terminal nodes and involves 4 explanatory variables: marital status (X3), education level (X4), gender (X2), and age (X1). Of the terminal nodes, 5 classify the labor force into the working category and 3 into the unemployed category.
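The oversampling idea behind SMOTE can be illustrated with a minimal sketch. This is not the study's implementation: the function name `smote`, its parameters, and the toy feature vectors are all hypothetical, and the sketch handles only numeric features, interpolating between a minority-class point and one of its nearest minority-class neighbours. How the study encoded its categorical predictors before oversampling is not stated in the abstract.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature vectors (minority class only).
    This is an illustrative sketch, not the procedure used in the study.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (Euclidean distance).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # The new point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data: hypothetical (age, years of schooling) vectors, not from the study.
unemployed = [[19.0, 12.0], [22.0, 12.0], [24.0, 16.0], [21.0, 9.0]]
working_count = 10  # hypothetical size of the majority class
new_points = smote(unemployed, n_new=working_count - len(unemployed), k=3)
balanced_minority = unemployed + new_points
print(len(balanced_minority))  # 10
```

After oversampling, the minority class has as many rows as the majority class, so a classification tree grown on the combined data is no longer dominated by the majority category.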