{"title":"Nonparametric Regression Modeling with Fourier Series Approach on Poverty Cases in West Sumatra Province","authors":"Melin Wanike Ketrin, Fadhilah Fitri, Atus Amadi putra, Zilrahmi","doi":"10.24036/ujsds/vol1-iss2/32","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss2/32","url":null,"abstract":"Poverty is a complex problem that has an impact on various social problems such as education, unemployment, health and economic growth. Therefore, the problem of poverty is important to overcome in order to create population welfare. One of the analyses that can be used to model the percentage of poverty is regression analysis. Regression analysis is divided into two approaches, namely parametric and nonparametric. Parametric regression relies on several assumptions, while the only assumption of nonparametric regression is that the shape of the curve does not follow a particular pattern. There are several approaches to nonparametric regression, one of which is the Fourier Series. The purpose of this study is to model the percentage of poverty in West Sumatra Province. The unclear shape of the curve in the data used is a consideration for using nonparametric regression. The data used in this study are regional data, which tend to fluctuate, so the Fourier series approach is suitable. In this research, nonparametric regression modeling with one, two, and three oscillation parameters was attempted. 
The best model was obtained which consisted of two oscillation parameters with a Generalized Cross Validation (GCV) value of 2.110 and R² of 92.44%.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123833001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of Forecasting Using Fuzzy Time Series Chen Model and Lee Model to Closing Price of Composite Stock Price Index","authors":"Mohammad Reza febrino, Dony Permana, Syafriandi, Nonong amalita","doi":"10.24036/ujsds/vol1-iss2/22","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss2/22","url":null,"abstract":"Investment is an activity to invest with the hope that someday you will get a number of benefits from the investment result. In investing, analyzing is important to see the current situation and condition of stock. Investors can forecast stock prices by looking at trends based on data movements from stock prices in the past. Fuzzy Time Series (FTS) was used in this study to forecast. Fuzzy time series is a forecasting technique that uses patterns from past data to project future data in areas where linguistic values are formed in the data. This study compares the closing price of composite stock forecasting using the fuzzy time series Chen and Lee models. The JCI's closing price for the following period is 6,904 and has a Mean Absolute Percentage Error (MAPE) of 4.03%, according to the Chen fuzzy time series method. In contrast, utilizing Lee's fuzzy time series method, the predicted JCI closing price for the following period is 7,046, with a MAPE value of 3.10 percent. 
It can be concluded from the forecasting results of the Chen and Lee methods that the Lee model FTS is superior to the Chen model FTS in predicting the JCI closing price.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127678846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of Naive Bayes Method and Binary Logistics Regression on Classification of Social Assistance Recipients Program Keluarga Harapan (PKH)","authors":"Fanni Rahma Sari, Fadhilah Fitri, Atus Amadi putra, Dony Permana","doi":"10.24036/ujsds/vol1-iss2/24","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss2/24","url":null,"abstract":"Population density is one of the causes of economic inequality in society. One of the solutions provided by the government is to distribute social assistance. In 2007 the government created a social assistance program called the “Program Keluarga Harapan” (PKH) with the aim of alleviating poverty. There are several problems in the distribution of social assistance, one of which is receiving aid that is not right on target. Therefore, an appropriate method is needed in classifying the recipients of social assistance properly. This study will use two methods, namely Naive Bayes and Binary Logistic Regression to compare which method is better on the data used. The data used is the DTKS data for PKH assistance recipients in the Anduring Village in 2020. Based on the results obtained, the accuracy of the Naive Bayes method is 70% and Binary Logistic Regression is 73%. So the best method in measuring classification is Binary Logistic Regression.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115034976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of Random Forest to Identify for Poor Households in West Sumatera Province","authors":"Febri Ramayanti, Dodi Vionanda, Dony Permana, Zilrahmi","doi":"10.24036/ujsds/vol1-iss2/31","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss2/31","url":null,"abstract":"Poverty is a socioeconomic problem in Indonesia. The number of people living in poverty in West Sumatera increased by 26.44 thousand from 2020 to 2021. The government has created programs to cope with poverty by taking into account the criteria for poor households. These criteria have been developed using data obtained through the National Socioeconomic Survey (Susenas). However, instead of showing the actual location of poor households, the existing data only give the number of poor households. This makes the programs less effective. This could be overcome by classification analysis with random forest (RF). RF is a collection of many decision trees. Before fitting RF, one has to determine the values of three tuning parameters: mtry, ntree and node size. The results are the smallest OOB error rate (%) and the Variable Importance Measure (VIM). The RF classification in this research yields an OOB error rate of 5.65%, or an accuracy rate of 94.35%, with tuning parameters mtry=5 and ntree=500. 
Based on the VIM, the poor household’s criteria include sources of drinking water such as protected or unprotected spring water and surface water, lighting tools such as non-PLN electricity or no usage of electricity, fuel for cooking such as charcoal and firewood, and the head of the household being self-employed, a family worker, or unpaid with at least a junior high degree.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134426688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multivariate Adaptive Regression Spline Method for Study Timeliness of the 2017 FMIPA UNP Student","authors":"Rahmadani Iswat, Fadhilah Fitri, Atus Amadi putra, Zilrahmi","doi":"10.24036/ujsds/vol1-iss2/23","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss2/23","url":null,"abstract":"Punctuality of study refers to completing an education within the set time period, which for undergraduate students is 4 years. One of the determinants of the quality of higher education is students’ ability to complete their education on time. The purpose of this study is to see the best modeling results and the accuracy of the punctuality of study of class 2017 FMIPA UNP undergraduate students using MARS. MARS is a method of multivariate nonparametric regression between response variables and predictor variables. The type of research used is applied research. The predictor variables used in this study are Grade Point Average (GPA), gender, university entrance, major, school origin status and place of origin, while the response variable is punctuality of learning time. The results of trial and error showed that the best model was obtained from a combination (BF = 18, MI = 3 and MO = 2), with a minimum GCV value of 0.23182 and R² value of 0.10045. From the model, it can be seen that the factors that significantly affect punctuality of learning time for FMIPA UNP students class 2017 are X4 (major) with an importance level of 100%, X1 (GPA) with an importance level of 96.61%, X3 (university entrance) and X5 (school origin status) with an importance level of 16.78%. 
The classification accuracy on the 2017 student study timeliness is 64% based on graduating on time and not on time, with a classification error rate of 36%.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130213810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison K-Means and Fuzzy C-Means Methods to Grouping Human Development Index Indicators in Indonesia","authors":"Belia Mailien, Admi Salma, Syafriandi, Dina Fitria","doi":"10.24036/ujsds/vol1-iss1/4","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss1/4","url":null,"abstract":"The Human Development Index (HDI) is an important indicator to measure the success of efforts to improve people's quality of life. The increase in the human development index in Indonesia is not accompanied by an even distribution of the human development index in every district/city in Indonesia. To facilitate the government in making policies and plans in overcoming the uneven HDI in Indonesia, it is necessary to group districts/cities in Indonesia based on HDI indicators. This study discusses the use of the K-means and Fuzzy C-Means algorithms with a total of 4 clusters. The grouping results obtained summarize that most districts/cities in Papua Island have low HDI indicators. The achievement of the HDI indicator in the medium category on the K-Means and Fuzzy C-Means methods is the same, spread across all major islands in Indonesia. However, the Nusa Tenggara Islands generally have a medium HDI indicator achievement. The achievements of the HDI indicators with high categories in the K-Means and Fuzzy C-Means methods are mostly found on the islands of Sumatra, Java, Kalimantan, and Sulawesi. The achievement of the HDI indicator in the very high category in the K-Means and Fuzzy C-Means methods is found in provincial capitals in several provinces in Indonesia as well as in big cities in Indonesia. 
The results of this study indicate that the S_DBW index and C_index values of the Fuzzy c-means method are smaller than the K-Means method, namely 2.312 and 0.105.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121447104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Open Unemployment Rate in West Sumatera Province Using Truncated Spline Regression","authors":"Aprilla Suhada, Syafriandi, Dodi Vionanda, F. Fitri","doi":"10.24036/ujsds/vol1-iss1/3","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss1/3","url":null,"abstract":"The Open Unemployment Rate (TPT) is an indicator used to measure the unemployment rate in the labor force which shows the percentage of the number of job seekers to the total workforce. In 2020 West Sumatra Province occupies the eighth position as the largest contributor to unemployment in Indonesia, this is a problem for the West Sumatra Provincial government. To deal with the unemployment problem, it is necessary to analyze the factors that are thought to affect the open unemployment rate in West Sumatra Province using truncated spline regression on the grounds that the data pattern between the response variables and each predictor variable does not form any pattern. Several factors are thought to influence the open unemployment rate, namely population, labor force participation rate, average length of schooling, dependency ratio. Based on the results of the analysis, the best model for modeling the open unemployment rate in West Sumatra Province is the truncated spline regression using three knot points with a GCV value of 0.061762. 
Variables that have a significant effect are population, labor force participation rate, average length of schooling and dependency ratio with a coefficient of determination of 99.97%.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132641805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Time Series Modeling on Stock Return at PT. Telecommunication Indonesia Tbk.","authors":"Hana Rahma Trifanni, D. Permana, N. Amalita, A. A. Putra","doi":"10.24036/ujsds/vol1-iss1/8","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss1/8","url":null,"abstract":"One approach to modeling time series data is the ARMA model, which assumes constant volatility. However, in economic and financial data, there are many cases where volatility is not constant. This results in heteroscedasticity problems in the residuals, so a GARCH model is needed. In addition to heteroscedasticity, another problem with residuals is the asymmetric effect or leverage effect. For that we need asymmetric GARCH modeling. This study aims to compare the accuracy of the ARMA, GARCH, and asymmetric GARCH models. This research is applied research. The data used are daily stock return data from February 2020 to February 2022, totaling 488 observations. The results showed that the best model in modeling stock return volatility is ARMA(0,1). The accuracy of this model is very good with MAD value of 0.0018644 and RMSE value of 0.0025352.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128977141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting Shallot Prices in West Sumatra Province Using the Fuzzy Time Series Method of the Singh Model and the Cheng Model","authors":"Huriati Khaira, F. Fitri, N. Amalita, D. Permana","doi":"10.24036/ujsds/vol1-iss1/7","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss1/7","url":null,"abstract":"Shallots are one of the leading spices that are widely used by humans as food seasoning and traditional medicine. The price of shallots always fluctuates which can affect the buying and selling of consumers and producers. Therefore, forecasting is used as a reference to be able to predict the price of shallots in the future and can provide convenience to the public for the condition of shallot prices in the next period. The forecasting method used is the fuzzy time series (FTS) method. FTS is a method whose forecasting uses data in the form of fuzzy sets sourced from real numbers to the universe set on actual data. Forecasting models used in this study are Singh's FTS model and Cheng's model. The data used is monthly data on shallot prices in West Sumatra Province for the period January 2018 to March 2022. The results obtained in this forecast are the Singh model FTS has a smaller MAPE value of 4.41% with a forecasting accuracy value of 95.59 %. This means that Singh's FTS model is better at predicting the price of shallots in West Sumatra Province.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116844979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grouping The Districts in Sumatera Region Based on Economic Development Indicators Using K-Medoids and CLARA Methods","authors":"Retsya Lapiza, Syafriandi, N. Amalita, Dina Fitria","doi":"10.24036/ujsds/vol1-iss1/13","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss1/13","url":null,"abstract":"Inequality in economic development is an economic problem that is often felt by developing countries. In Indonesia, one of the regional areas that has not yet experienced equal distribution of economic development is the regencies/cities of the Sumatera Region. This study aims to determine regional groups and compare the results of grouping with the K-Medoids and CLARA methods. The K-Medoids and CLARA methods are non-hierarchical methods that are robust against outliers. The best method is selected by comparing silhouette coefficients. The results show that, for both the K-Medoids and CLARA methods, forming 2 groups is better than forming 3 groups. The K-Medoids method resulted in cluster 1 with 59 districts/cities and cluster 2 with 95 districts/cities. Meanwhile, the grouping of districts/cities using the CLARA method with 2 groups resulted in cluster 1 with 74 districts/cities and cluster 2 with 80 districts/cities. From the comparison of the two methods, the silhouette coefficient values using the K-Medoids and CLARA methods are 0.13 and 0.15 respectively. 
Therefore, the CLARA method with 2 groups gave better cluster results.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133082528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}