Data Analytics of Human Development Index(HDI) with Features Descriptive and Predictive Mining

Ferry Astika Saputra, A. Barakbah, Putri Riza Rokhmawati
2020 International Electronics Symposium (IES), published 2020-09-01
DOI: 10.1109/IES50839.2020.9231661 (https://doi.org/10.1109/IES50839.2020.9231661)
Citations: 3

Abstract

The Human Development Index (HDI) of Indonesia increases every year. Because Indonesia has many provinces and districts/cities, the government needs considerable time to analyze the data. This research proposes a new method for analyzing HDI data using descriptive and predictive mining. It produces two main results: first, a segmentation of HDI data into four segments (low, medium, high, and very high); second, a prediction of HDI data. Before analysis, the system preprocesses the data: cleaning repairs missing values, and normalization (transformation) rescales the data to a smaller range (from 0 to 1). Segmentation uses the descriptive mining method, which has two steps. First, the system groups and labels the data based on the values of the HDI indicators (life expectancy, expected years of schooling, mean years of schooling, and income per capita) using hierarchical clustering with the centroid linkage method. Second, the system interprets each cluster based on the distance between its centroid and the origin (0,0). Prediction uses the predictive mining method, which applies a Weighted Moving Average (WMA) to the last three years of HDI data. The variance accuracy value of the descriptive mining method is 0.203, and the Mean Absolute Percentage Error (MAPE) of the predictive mining method is 0.27%.
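The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration written for this summary, not the authors' implementation: the function names, the WMA weights (1, 2, 3), and the toy two-dimensional points are all assumptions, since the abstract gives neither the weights nor any data. The real method uses four indicators per region; the code works for any number of dimensions.

```python
from math import dist

# Segment names from the abstract, ordered from nearest to farthest from the origin.
LABELS = ["low", "medium", "high", "very high"]

def min_max(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(axis) / len(points) for axis in zip(*points)]

def centroid_linkage(points, k):
    """Agglomerative clustering: repeatedly merge the two clusters
    whose centroids are closest until only k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: dist(centroid(clusters[ij[0]]),
                                              centroid(clusters[ij[1]])))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

def label_clusters(clusters):
    """Sort clusters by centroid distance from the origin and name them."""
    origin = [0.0] * len(clusters[0][0])
    ordered = sorted(clusters, key=lambda c: dist(centroid(c), origin))
    return list(zip(LABELS, ordered))

def wma(values, weights=(1, 2, 3)):
    """Weighted moving average of the most recent len(weights) values.
    The weights are an assumption; the newest value gets the largest weight."""
    recent = values[-len(weights):]
    return sum(w * v for w, v in zip(weights, recent)) / sum(weights)

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Toy, made-up normalized points (NOT data from the paper).
toy_points = [[0.0, 0.0], [0.05, 0.0], [0.3, 0.3], [0.35, 0.3],
              [0.6, 0.6], [0.65, 0.6], [0.95, 1.0], [1.0, 1.0]]
segments = label_clusters(centroid_linkage(toy_points, k=4))
forecast = wma([70.0, 71.0, 72.0])  # hypothetical HDI values for three years
```

Ranking clusters by how far their centroid lies from the origin is what turns anonymous cluster IDs into the ordered labels: after normalization, a larger distance means higher indicator values overall. For real workloads, a library routine such as SciPy's hierarchical clustering with `method='centroid'` would be a more practical choice than this quadratic pure-Python loop.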