Nabillah Putri Shalsabila, N. Amalita, Dodi Vionanda, D. Permana
{"title":"Grouping Level of Poverty Based on District/City in Indonesia Using K-Harmonic Means","authors":"Nabillah Putri Shalsabila, N. Amalita, Dodi Vionanda, D. Permana","doi":"10.24036/ujsds/vol1-iss3/60","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss3/60","url":null,"abstract":"Indonesia still has a relatively high poverty rate, although nationally it has declined in recent years, and some areas are still experiencing rising poverty rates. Poverty alleviation plans therefore can no longer be uniform but must account for the conditions of each dimension that causes poverty in an area, which makes it necessary to group districts/cities in Indonesia by poverty. Grouping was performed using K-Harmonic Means analysis. K-Harmonic Means is a non-hierarchical clustering method that uses the harmonic mean of the distances between each data point and the cluster centers. The data used in this research are secondary data sourced from BPS publications on poverty and inequality in 2022. The analysis consists of standardizing the data, conducting the cluster analysis, and validating the clusters. Based on the results of the K-Harmonic Means analysis, the optimal number of clusters is two: the first cluster has 54 districts/cities and the second has 460, and the Dunn Index value for cluster validation is 0.03492. 
Thus, a better grouping of poverty levels by district/city in Indonesia is obtained using the K-Harmonic Means method with p = 2.25.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129664337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rizqa Fajriaty, MY Fitri, Dina Fitria, Zilrahmi Syafriandi
{"title":"Vector Error Correction Model for Cointegration Analysis of Factors Affecting Indonesia's Economic Growth during the Pandemic Period","authors":"Rizqa Fajriaty, MY Fitri, Dina Fitria, Zilrahmi Syafriandi","doi":"10.24036/ujsds/vol1-iss3/40","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss3/40","url":null,"abstract":"Stable economic growth is the ultimate goal of monetary policy, as reflected in the stability of the rupiah. Economic conditions declined due to the spread of Covid-19. In an effort to stabilize the economy, the relationships among the factors supporting Indonesia's economic growth were analyzed using the VECM approach. This approach can determine the long-run and short-run relationships in time series data. After the model passed several tests, three significant equations were obtained. The model indicates a short-run effect of inflation and the BI rate on inflation, as well as an inverse effect of the BI rate lagged one period on the exchange rate. The negative cointegration coefficient indicates a short-run to long-run adjustment mechanism in the inflation variable. The two long-run cointegration equations show that, in the long run, inflation can be positively affected by the visa variable. In the long run, the BI rate is affected by the exchange rate and visa variables. 
The resulting VECM can explain more than 50% of the variables.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127334802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The SMOTE Application of CART Methods for Coping Imbalanced Data in Classifying Status Work on Labor Force in the City of Padang","authors":"A. Yulianti, F. Fitri, N. Amalita, Dodi Vionanda","doi":"10.24036/ujsds/vol1-iss3/12","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss3/12","url":null,"abstract":"Employment is one of the main concerns in every country, especially in developing countries such as Indonesia. The employment problems Indonesia faces are a lack of job opportunities, excess labor, and the uneven distribution of labor. The labor force grows faster than the number of available jobs, so many workers cannot find work, which causes unemployment. The city of Padang had the highest unemployment rate in West Sumatra from 2013 to 2021. Developing a smart city and identifying the factors that influence unemployment are among the efforts to reduce unemployment. This study uses the CART method to determine the factors that affect the work status of the labor force in the city of Padang. An advantage of the CART method is that its results are easy to interpret, but the accuracy of the classification tree is low when the data are imbalanced. Therefore, this study uses the SMOTE method to overcome this problem. 
The optimal classification tree has 8 terminal nodes and involves 4 explanatory variables: marital status (X3), education level (X4), gender (X2), and age (X1). Five terminal nodes classify the labor force into the working category and three classify it into the unemployed category.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"179 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133368563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rizqia Salsabila, A. A. Putra, N. Amalita, F. Fitri
{"title":"Analysis of Factors Influencing the Population Growth Rate in West Sumatra Using Geographically Weighted Logistic Regression","authors":"Rizqia Salsabila, A. A. Putra, N. Amalita, F. Fitri","doi":"10.24036/ujsds/vol1-iss3/59","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss3/59","url":null,"abstract":"The Geographically Weighted Logistic Regression (GWLR) model is an extension of the logistic regression model applied to spatial data. GWLR model parameters are estimated at each observation location using spatial weighting. The purpose of this research was to fit the GWLR model to dichotomous data on the Population Growth Rate (PGR) indicator for each district/city in West Sumatra in 2020 and to identify the factors that influence the probability that the population growth rate will increase in the 19 districts/cities of West Sumatra in 2020. The parameters of the GWLR model are estimated using the Maximum Likelihood Estimation (MLE) method. Spatial weights for parameter estimation are determined using the Fixed Gaussian Kernel weighting function, with the optimal bandwidth chosen using Akaike's Information Criterion (AIC). The categorical response variable in this study is the population growth rate in each district/city in West Sumatra in 2020, and the predictor variables are the number of couples of childbearing age, the number of live births, the number of in-migrants, and the number of out-migrants. 
The results show that the GWLR model is better than the logistic regression model and that 4 groups of districts/cities are formed based on the factors that affect the increase in the population growth rate.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128883594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Human Development Index in Papua and West Sumatera with Multivariate Adaptive Regression Spline","authors":"Y. Pertiwi, D. Permana, N. Amalita, Admi Salma","doi":"10.24036/ujsds/vol1-iss3/54","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss3/54","url":null,"abstract":"The Human Development Index (HDI) is an indicator that gives a comprehensive measure of achievement in developing the quality of human life in a region. In Indonesia, there are still many areas with low HDI, especially in Papua Province. This study aims to model HDI and identify the factors that affect it in Papua Province and West Sumatera Province using Multivariate Adaptive Regression Spline (MARS). MARS is a modeling method that can handle high-dimensional data, namely data with multiple independent variables and a sample whose pattern is not known in advance, and can be applied to examine interactions between the variables used. The best MARS model for Papua Province is the combination BF=24, MI=2, and MO=0 with GCV=0.55953, while the best MARS model for West Sumatera Province is the combination BF=24, MI=2, and MO=0 with GCV=0.02697. The factors that significantly affect HDI in Papua Province and West Sumatera Province are average length of schooling (X2), adjusted per-capita income (X6), life expectancy (X1), percentage of poor population (X4), and gross regional domestic product (X3). The importance of each variable for Papua Province is 100%, 45.26%, 29.24%, 6.55%, and 6.27%, while for West Sumatera Province it is 100%, 96.73%, 57.54%, 34.13%, and 29.6%, respectively. 
Thus, based on these results, average length of schooling (X2) is the variable that most influences HDI in the two regions.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128908821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Silvia Agustina, F. Fitri, Dodi Vionanda, Admi Salma
{"title":"Rainfall Forcasting in Medan City Using Singular Spectrum Analysis (SSA)","authors":"Silvia Agustina, F. Fitri, Dodi Vionanda, Admi Salma","doi":"10.24036/ujsds/vol1-iss3/52","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss3/52","url":null,"abstract":"Singular spectrum analysis is a time series analysis that can be used for data that has seasonal effects. Rainfall is one example that has a seasonal effect. High rainfall has an impact on natural disasters such as floods. Medan city is the capital city of North Sumatra province which has quite high rainfall and is a lowland area, so it has the potential for flooding. Rainfall forecasting can be done as disaster mitigation. The forecasting method used is SSA. The MAPE forecasting accuracy value obtained is 15.7% and the tracking signal is within tolerance limits so that it can be concluded that the forecasting is done well.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"127 8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128022321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Muhammad Tibri Syofyan, N. Amalita, Dodi Vionanda, Dina Fitria
{"title":"Comparison of Distance Function in K-Nearest Neighbor Algorithm to Predict Prospective Customers in Term Deposit Subscriptions","authors":"Muhammad Tibri Syofyan, N. Amalita, Dodi Vionanda, Dina Fitria","doi":"10.24036/ujsds/vol1-iss3/47","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss3/47","url":null,"abstract":"Data mining is often used to analyze big data to obtain new, useful information for future use. One of the best algorithms in data mining is K-Nearest Neighbor (K-NN). The K-NN classifier is a distance-based classification algorithm. The distance function is a core component for measuring the distance or similarity between the test data and the training data. Because many distance functions exist, determining which one gives the best K-NN classifier performance is a recurring problem in the literature. This study aims to compare which distance function produces the best K-NN performance. The distance functions compared are the Manhattan distance and the Minkowski distance. The K-NN classifier is applied to a bank dataset to predict prospective customers for term deposit subscriptions. This study shows that the Minkowski distance achieved better results with the K-NN algorithm than the Manhattan distance. The Minkowski distance with power p = 1.5 produces an accuracy of 88.40% when K is 7. 
Thus, the K-NN algorithm using the Minkowski distance (p = 1.5, K = 7) performs best in predicting prospective customers for term deposit subscriptions.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125037372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grouping The Regencies/Cities in Indonesia Based on Expenditure Groups Inflation Value Using DBSCAN Method","authors":"Meliani Putri, Dony Permana, Syafriandi Syafriandi, Zilrahmi","doi":"10.24036/ujsds/vol1-iss3/61","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss3/61","url":null,"abstract":"Inflation is one of the important problems a country faces in achieving its economic goals and targets. Inflation is measured using the Consumer Price Index for eleven expenditure groups in 90 regencies/cities in Indonesia. Differences in inflation rates between regencies/cities affect Indonesia's national inflation. The purpose of this research is to group regencies/cities based on expenditure-group inflation values and to identify the characteristics of the resulting groups. DBSCAN is a density-based non-hierarchical clustering method that can be used on data containing outliers. The data used in this study are secondary data obtained from the publication of the Badan Pusat Statistik Republic of Indonesia (BPS RI) on inflation by expenditure group. The analysis includes outlier detection, grouping using the DBSCAN method, cluster validation with the silhouette coefficient, and identification of the characteristics of the clusters formed. The grouping produces two clusters with a silhouette coefficient of 0.65. Cluster 0 is a noise cluster consisting of 3 regencies/cities that have high expenditure-group inflation rates. 
Cluster 1 consists of 87 regencies/cities that have low expenditure-group inflation rates.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134520440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Amelia Fadila Rahman, Syafriandi, N. Amalita, Zilrahmi
{"title":"Geographically Weighted Panel Regression Modeling on Human Development Index in West Sumatra","authors":"Amelia Fadila Rahman, Syafriandi, N. Amalita, Zilrahmi","doi":"10.24036/ujsds/vol1-iss3/63","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss3/63","url":null,"abstract":"The Human Development Index (HDI) is an important issue that has a negative impact on the field of human development and people's welfare in West Sumatra Province. One way to address the HDI problem is to identify the factors that influence it. A method that can identify influencing factors and account for the characteristics of the observation areas is Geographically Weighted Panel Regression (GWPR). GWPR combines panel data regression with GWR and is used when the data exhibit spatial heterogeneity. The purpose of this study is to build a GWPR model for the HDI of regencies/cities in West Sumatra from 2019 to 2022. The modeling uses the GWPR Fixed Effect Model. The weighting function used is a fixed exponential kernel with a minimum CV of 0.00208. The results show that the model has a coefficient of determination (R2) of 99.9%, which means the predictor variables explain 99.9% of the variation in HDI. The variables that have a significant effect on HDI are Life Expectancy, Expected Years of Schooling, Mean Years of Schooling, and Purchasing Power Parity.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133751256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gezi Fajri, S. Syafriandi, N. Amalita, Zamahsary Martha
{"title":"Comparison of Queen Contiguity and Customized Weighting Matrices on Spatial Regression to Identify Factors Impacting Poverty in East Java","authors":"Gezi Fajri, S. Syafriandi, N. Amalita, Zamahsary Martha","doi":"10.24036/ujsds/vol1-iss3/67","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss3/67","url":null,"abstract":"Poverty is a crucial problem that has a negative impact on all sectors, including economic, social, and cultural development, in East Java Province. Poverty can also increase unemployment and crime, trigger social disasters, and hinder the progress of East Java Province. One effort to overcome poverty in East Java Province is to detect the factors that influence it. This can be done through statistical modeling. A statistical model that can identify the factors that influence poverty and explain the relationship between a region and its surrounding area is spatial regression analysis. In spatial regression analysis, a spatial weighting matrix is needed to determine the spatial influence between regions, where one region influences its neighbors. A spatial weighting matrix that is often used is queen contiguity. According to Anselin (1988:20), spatial weighting can also take into account prior information, the purpose of the case studied, and the theory underlying the research. Such a weighting, built from the social and economic variables of the case under study, is the customized weighting matrix. The results of this study show that the best combination of spatial regression model and spatial weighting is the General Spatial Model (GSM) with customized weighting, which produces better estimates than the SAR, SEM, and GSM models with queen contiguity weighting in modeling district and city poverty in East Java Province, with an Akaike Information Criterion (AIC) value of 188.77 and a determination coefficient (R2) of 84.95%. 
School's Expected Time, Life Expectancy Score, and Employment Participation Rate are the factors with a substantial impact on the percentage of the population living in poverty in East Java's districts and cities in 2021.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134542091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}