Pengelompokan Tingkat Kesejahteraan Masyarakat di Sumatera Utara dengan Metode K-Means Clustering (Grouping Community Welfare Levels in North Sumatra Using the K-Means Clustering Method)

Jurnal Matematika Integratif. Pub Date: 2022-01-23. DOI: 10.24198/jmi.v17.n2.35025.127-135

Sagita Charolina Sihombing, Dina Agnesia Sihombing



Abstract

Grouping the regencies and cities of North Sumatra Province by level of community welfare makes it easier for the government to focus development on those whose welfare levels are still low. In this study, the welfare level of the people of North Sumatra was grouped based on several variables using the K-means clustering method. K-means is a clustering method used to partition large amounts of data into a number of groups that is specified in advance. To determine the best number of groups, the Elbow method was used. First, the data were partitioned into k groups for each k from 2 to 8. Next, the SSE (sum of squared errors) was calculated for each k from 2 to 8. An Elbow graph was then plotted from the resulting SSE values to determine the optimal k. The clustering for each k was carried out in Matlab 2013b, the resulting group data were stored in Microsoft Excel, and the Elbow graph was displayed in a Matlab GUI. The Elbow graph shows that the SSE decreases sharply from k = 2 to k = 5, while from k = 5 to k = 8 the decrease is not significant. The optimal number of clusters is therefore k = 5, so the regencies and cities of North Sumatra are optimally grouped into five clusters.
Cluster 1 contains only Medan City. Cluster 2 consists of North Tapanuli Regency, Toba Regency, Simalungun Regency, Dairi Regency, Karo Regency, Langkat Regency, Humbang Hasundutan Regency, West Pakpak Regency, Samosir Regency, Serdang Bedagai Regency, Padangsidimpuan City, and Gunungsitoli City. Cluster 3 consists of Deli Serdang Regency, Pematangsiantar City, Tebingtinggi City, and Binjai City. Cluster 4 consists of Labuhanbatu Regency, Asahan Regency, Batu Bara Regency, South Labuhanbatu Regency, North Labuhanbatu Regency, Sibolga City, and Tanjungbalai City. Cluster 5 consists of Nias Regency, Mandailing Natal Regency, South Tapanuli Regency, Central Tapanuli Regency, South Nias Regency, North Padang Lawas Regency, Padang Lawas Regency, North Nias Regency, and West Nias Regency.
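
The procedure described in the abstract (run K-means for k = 2 to k = 8, record the SSE for each k, then look for the elbow) can be illustrated with a minimal sketch. This is not the authors' Matlab 2013b code, and it does not use their data: the 33 rows (one per regency or city named in the abstract) and the four indicator columns are synthetic placeholders, and the function names `kmeans` and `elbow_sse`, the z-score standardization, and the random restarts are all assumptions made for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm. Returns (labels, centroids, sse)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct rows chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment and SSE: the sum of squared distances
    # from each row to its assigned centroid.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    sse = float(d2[np.arange(len(X)), labels].sum())
    return labels, centroids, sse


def elbow_sse(X, k_values, n_restarts=10):
    """For each k, keep the lowest SSE found over several random restarts."""
    return {
        k: min(kmeans(X, k, seed=s)[2] for s in range(n_restarts))
        for k in k_values
    }


# Synthetic stand-in data (an assumption, not the study's data):
# 33 rows and 4 hypothetical welfare indicators drawn around 5 centres.
rng = np.random.default_rng(42)
centres = rng.normal(scale=5.0, size=(5, 4))
X = centres[rng.integers(0, 5, size=33)] + rng.normal(size=(33, 4))
# Standardise each column so that no single indicator dominates the distance.
X = (X - X.mean(axis=0)) / X.std(axis=0)

sse_by_k = elbow_sse(X, range(2, 9))
for k, sse in sse_by_k.items():
    print(k, round(sse, 2))
```

Plotting `sse_by_k` against k gives the Elbow graph; the elbow is the value of k after which further increases reduce the SSE only slightly, which the abstract reports as k = 5 for the study's data.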

