{"title":"Comparing Logistic Regression and Support Vector Machine in Breast Cancer Problem","authors":"Caecilia Bintang Girik Allo, L. Putra, N. R. Paranoan, Vincentius Abdi Gunawan","doi":"10.34312/jjps.v4i1.19246","DOIUrl":"https://doi.org/10.34312/jjps.v4i1.19246","url":null,"abstract":"Several methods are available for classification problems, and they are used in many different fields. Support Vector Machine (SVM) is a popular classification method that many researchers have proposed. Even with the same method and the same dataset, different schemes for splitting the data into training and testing sets can yield different prediction accuracy, which is crucial in classification. In this paper, we compare the prediction accuracy of SVM and Logistic Regression to determine the better method for classifying the current condition of a patient after undergoing treatment. Several processing steps are applied, including feature selection, feature extraction, and separating the training and testing data using Holdout and K-Fold cross validation (CV). Stepwise selection is used to reduce the number of features. Training and testing datasets are obtained using the five stratified and non-stratified holdout and five-fold stratified and non-stratified cross validation. The results show that the best method for classifying the cancer dataset is SVM with a radial kernel under five-fold stratified cross validation. The obtained accuracy is 81.816% with a variance of 0.94%.","PeriodicalId":315674,"journal":{"name":"Jambura Journal of Probability and Statistics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123962101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Perbandingan Fuzzy Time Series Lee untuk Meramalkan Nilai Tukar Petani di Provinsi Gorontalo","authors":"Alvitha Habibie, Lailany Yahya, Isran K. Hasan","doi":"10.34312/jjps.v4i1.17453","DOIUrl":"https://doi.org/10.34312/jjps.v4i1.17453","url":null,"abstract":"Gorontalo Province is one of the provinces in Indonesia where 60% of the population are farmers and fishermen. In 2020, the agricultural sector contributed 28.66% of the Gross Regional Domestic Product (PDRB) of Gorontalo Province. The Farmer's Exchange Rate (Nilai Tukar Petani, NTP) is a measure of the exchange capability of agricultural products relative to goods or services. NTP forecasting is therefore needed as a reference for future decisions aimed at strengthening the agricultural sector. In this study, the Holt-Winters Exponential Smoothing method is compared with Lee's Fuzzy Time Series (FTS Lee) to find the best forecasting method for predicting NTP in Gorontalo Province. Based on the forecasting results, FTS Lee has a MAPE value of 0.65557% for order 1 and 0.55607% for order 2, while the corresponding values for multiplicative and additive Holt-Winters Exponential Smoothing are 5.92509% and 6.14574%, respectively. It can be concluded that the best method for predicting NTP in Gorontalo Province is FTS Lee order 2.","PeriodicalId":315674,"journal":{"name":"Jambura Journal of Probability and Statistics","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121980927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Perbandingan Kinerja Metode Regresi K-Nearest Neighbor dan Metode Regresi Linear Berganda pada Data Boston Housing","authors":"Lutfi Sivana Ihzaniah, Adi Setiawan, R. W. N. Wijaya","doi":"10.34312/jjps.v4i1.18948","DOIUrl":"https://doi.org/10.34312/jjps.v4i1.18948","url":null,"abstract":"This research compares the performance of the K-Nearest Neighbor (KNN) regression method and the multiple linear regression method on the Boston Housing data. The performance measures considered are MAE, RMSE, MAPE, and R2. The KNN method predicts a value based on the training examples closest to an object, while multiple linear regression is a forecasting technique involving more than one independent variable. The comparison of the two methods is based on the Mean Absolute Percentage Error (MAPE). The distance definitions used are Euclidean distance and Minkowski distance. The K value in the KNN method defines the number of nearest neighbors examined to determine the value of the dependent variable; this research uses K values from 1 to 10 for each test-data proportion and distance definition. The proportion of test data used was 20%, 30%, and 40% for both methods. The best MAPE value obtained by the KNN regression method was 12.89% at K = 3 for Euclidean distance and 13.22% at K = 3 for Minkowski distance, while the best MAPE value for the multiple linear regression method was 17.17%. The KNN regression method is therefore the better of the two, since its MAPE value is smaller than that of the multiple linear regression method.","PeriodicalId":315674,"journal":{"name":"Jambura Journal of Probability and Statistics","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134264272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analisis Faktor-Faktor Penghambat Penyelesaian Studi Mahasiswa Program Studi Matematika Universitas Sulawesi Barat Menggunakan PLS-SEM","authors":"Rahmah Abubakar, M. Rifandi, R. Rahmawati, F. Fatimah","doi":"10.34312/jjps.v4i1.19240","DOIUrl":"https://doi.org/10.34312/jjps.v4i1.19240","url":null,"abstract":"This research examines the factors that influence the completion of studies by students in the Mathematics Study Program at the University of West Sulawesi. The high number of alumni whose length of study exceeded the expected time is a problem that needs to be addressed and followed up. The analysis uses the partial least squares structural equation model (PLS-SEM). The study involved students from the 2016, 2017, and 2018 cohorts. Data were collected with an online questionnaire. The results show that the self-control factor and the intelligence and interest factor had a significant effect on students' motivation to complete their studies on time. In contrast, environmental factors and campus instrument factors did not have a significant effect.","PeriodicalId":315674,"journal":{"name":"Jambura Journal of Probability and Statistics","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133874311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediksi Jumlah Wisatawan Menggunakan Metode Random Forest, Single Exponential Smoothing dan Double Exponential Smoothing di Provinsi NTB","authors":"Ristu Haiban Hirzi, Umam Hidayaturrohman, Kertanah Kertanah, M. Amaly, Rody Satriawan","doi":"10.34312/jjps.v4i1.17088","DOIUrl":"https://doi.org/10.34312/jjps.v4i1.17088","url":null,"abstract":"The aim of this study is to forecast global tourist visits and to compare three forecasting methods, namely random forest, single exponential smoothing, and double exponential smoothing, to determine the best one. These methods are applied to global tourist visit data for West Nusa Tenggara Province. Random forest, single exponential smoothing, and double exponential smoothing are familiar methods that are frequently used in forecasting. In addition, the three methods have great accuracy for time series data, such as data on global tourist visits. The data used in this study are global tourist visits from 2014 to 2021 in West Nusa Tenggara Province. The results show that double exponential smoothing is the best of the three methods, based on the smallest Mean Absolute Percentage Error (MAPE) value of 325.759. The forecast indicates that tourist visits will increase relative to the previous period, starting from August, 2021 to July, 2021 with an estimated 847 to 1045 people.","PeriodicalId":315674,"journal":{"name":"Jambura Journal of Probability and Statistics","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114598723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementasi Metode Runtun Waktu dalam Pemodelan Total Harga Alat Kedokteran dan Kesehatan","authors":"Daniar Wahyu Laraswati, Achmad Fauzan","doi":"10.34312/jjps.v4i1.17873","DOIUrl":"https://doi.org/10.34312/jjps.v4i1.17873","url":null,"abstract":"A hospital is a health service institution that provides comprehensive individual health care through outpatient, inpatient, and emergency services. The health services provided are promotive, preventive, and rehabilitative. One of the efforts to improve the quality of hospital services is to provide good health services. To support the health services provided, health management is needed. The high price of medical supplies and equipment, which is driven by several factors, requires hospitals to plan the procurement of medical equipment and hospital medicine. Therefore, this study uses the Autoregressive Moving Average (ARMA) method to predict the Total Price of Medical and Health Equipment Needed at the Sleman Regional General Hospital in the coming period. Based on the analysis, one significant and best ARMA model is obtained with the AIC value of 223.92 with equation and the MAPE value of 18.78%, which means the accuracy of the forecasting is 81.22%.","PeriodicalId":315674,"journal":{"name":"Jambura Journal of Probability and Statistics","volume":"232 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132317253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PEMODELAN PENYAKIT INFEKSI SALURAN PERNAFASAN AKUT DI DAERAH SEKITAR SEMBURAN LUMPUR LAPINDO SIDOARJO DENGAN PENDEKATAN MODEL MULTIVARIATE ADDITIVE REGRESSION SPLINE","authors":"Mahfudhotin Mahfudhotin","doi":"10.34312/jjps.v3i2.16696","DOIUrl":"https://doi.org/10.34312/jjps.v3i2.16696","url":null,"abstract":"The hot mudflow phenomenon in Sidoarjo is worth investigating further. The disaster was caused by drilling errors that resulted in the Lapindo mudflow, whose gas emissions cause health problems, especially those related to the respiratory tract, namely acute respiratory tract infections (ARI). Risk factors that can affect the incidence of ARI in general are socio-demographic, biological, housing and density factors, and pollution. Therefore, this study aims to obtain a model for classifying ARI patient data in the Jabon, Tanggulangin, and Porong sub-districts of Sidoarjo district, together with the variables that contribute to the classification. The nonparametric Multivariate Adaptive Regression Spline (MARS) approach was chosen because several previous studies stated that this method gives higher classification accuracy than other classification methods. In addition, MARS is a classification method that is able to form a model with causal interactions; the best MARS model is obtained from a combination of Maximum Interaction (MI), Basis Function (BF), and Minimum Observation (MO) values. The MARS modeling results show that three variables contribute to the grouping, namely the percentage of the distance between the house and the source of the Lapindo mudflow, the number of activities outside the house, and the number of ventilation openings in the house. The overall model classification accuracy is 97.4 percent with a GCV value of 0.096 and an R2 of 82.9 percent.","PeriodicalId":315674,"journal":{"name":"Jambura Journal of Probability and Statistics","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121109427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PEMODELAN VECTOR AUTOREGRESSIVE EXOGENOUS (VARX) UNTUK MERAMALKAN DATA EKSPOR TOTAL DAN IMPOR TOTAL DI INDONESIA","authors":"Nurina Salsabila, Sri Wahyuningsih, I. Purnamasari","doi":"10.34312/jjps.v3i2.15527","DOIUrl":"https://doi.org/10.34312/jjps.v3i2.15527","url":null,"abstract":"Vector Autoregressive Exogenous (VARX) is a multivariate time series model that extends the Vector Autoregressive (VAR) model. The VARX model is a forecasting model that involves both endogenous and exogenous variables. The endogenous variables in this study are total exports and total imports in Indonesia, and the exogenous variable is the composite stock price index in Indonesia. The purpose of this study is to fit a VARX model to the total export and total import data in Indonesia for the period January 2016 to December 2021 and to forecast the period January 2022 to December 2022. Based on the analysis, the best model for forecasting total exports and total imports is the VARX(2,2) model, with a MAPE value of 5.938% for total exports and 8.313% for total imports. The forecasts of total exports increase over the period January 2022 to December 2022, from US$21,383.06 million in January 2022 to US$23,569.50 million in December 2022. The forecasts of total imports also increase over the same period, from US$17,743.17 million in January 2022 to US$20,269.07 million in December 2022.","PeriodicalId":315674,"journal":{"name":"Jambura Journal of Probability and Statistics","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126891609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ANALISIS SPASIAL PENYEBARAN PENYAKIT SCHISTOSOMIASIS MENGGUNAKAN INDEKS MORAN UNTUK MENDUKUNG ERADIKASI SCHISTOSOMIASIS DI PROVINSI SULAWESI TENGAH BERBASIS WEB DASHBOARD","authors":"Nur Sakinah, W. Saputra, Nurfitra Nurfitra, S. Satriani, J. Junaidi","doi":"10.34312/jjps.v3i2.16580","DOIUrl":"https://doi.org/10.34312/jjps.v3i2.16580","url":null,"abstract":"Schistosomiasis is a parasitic disease caused by infection with worms of the genus Schistosoma. The disease is zoonotic, so the sources of transmission include not only infected mammals but also humans. The method used in this study is spatial autocorrelation analysis, which is conducted to determine the presence or absence of global or local spatial autocorrelation as well as the distribution pattern of Schistosomiasis cases in Poso Regency using Moran's I. The results show that the p-value for positive global autocorrelation is 2.2 × 10^-16. This value is smaller than the 5% significance level and also smaller than the Moran's I value (0.66). The Moran's I value lies in the interval indicating that adjacent areas have the same number of Schistosomiasis cases. Meanwhile, in the local spatial autocorrelation test (LISA) for Schistosomiasis cases in Poso Regency, villages such as those in Lore Utara, Lore Timur and Lore Peore have a LISA value of 1, indicating a strong positive correlation. The distribution of Schistosomiasis cases in Poso Regency forms a clustered pattern, namely disease prone areas (HH), disease spread areas (HL), disease alert areas (LH) and disease safe areas (LL).","PeriodicalId":315674,"journal":{"name":"Jambura Journal of Probability and Statistics","volume":"281 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121042782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ANALISIS SENTIMEN MASYARAKAT PADA KEBIJAKAN VAKSINASI COVID-19 DI TWITTER MENGGUNAKAN METODE MESIN VEKTOR PENDUKUNG DENGAN KERNEL RADIAL BASIS FUNCTION BERBASIS FITUR LEKSIKON","authors":"S. Mulyani, Sri Astuti Thamrin, S. Siswanto","doi":"10.34312/jjps.v3i2.16663","DOIUrl":"https://doi.org/10.34312/jjps.v3i2.16663","url":null,"abstract":"Twitter is a popular social media platform used to get news quickly and briefly. After the outbreak of COVID-19 and the government's policy to vaccinate against COVID-19 in Indonesia, more and more public opinion has been expressed through tweets. This makes COVID-19 vaccination an interesting topic for sentiment analysis. Through sentiment analysis, information in the form of text data can be extracted and classified as positive or negative opinion. In this study, public opinion on COVID-19 vaccination was classified using the support vector machine method without and with lexicon-based features. The manually labeled data comprised 2981 tweets. Classifying public opinion on COVID-19 vaccination in Indonesia with a support vector machine without lexicon-based features gave accuracy, g-mean and AUC of 83%, 50% and 61.35%, respectively. With lexicon-based features, the support vector machine gave accuracy, g-mean and AUC of 90%, 86.63% and 87%, respectively. Based on these results, the support vector machine with lexicon-based features classifies public opinion on COVID-19 vaccination better than the support vector machine without lexicon-based features.","PeriodicalId":315674,"journal":{"name":"Jambura Journal of Probability and Statistics","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116579036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}