Widia Handa Riska, D. Permana, Atus Amadi Putra, and Zilrahmi
{"title":"Categorical Data Clustering with K-Modes Method on Fire Cases in DKI Jakarta Province","authors":"Widia Handa Riska, D. Permana, Atus Amadi Putra, and Zilrahmi","doi":"10.24036/ujsds/vol2-iss1/115","DOIUrl":"https://doi.org/10.24036/ujsds/vol2-iss1/115","url":null,"abstract":"In DKI Jakarta Province, the number of fires rises and falls from year to year. Efforts are therefore needed to prevent fires and reduce fire risk, a task for which BPBD DKI Jakarta is responsible. For these efforts to be effective, information is needed on the fire patterns that occur most frequently. Such patterns can be identified using K-Modes clustering of categorical data. The data used are fire records for DKI Jakarta in 2018. The optimal number of clusters was 6, chosen as the configuration with the smallest Davies-Bouldin Index (DBI) value, 6.22. Of the six clusters, cluster 3 contains the highest number of fire cases. The centroid of cluster 3 describes fires that occurred on a Friday, in November, in Cakung District, caused by an electrical short circuit, burning residential houses and rarely causing minor injuries, serious injuries, or deaths.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"25 6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140432087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aldi Prajela, Syafriandi Syafriandi, Dony Permana, Dina Fitria
{"title":"Twitter Data Sentimen Analysis 2024 Presidential Candidate Using Algorithm Naïve Bayes Classifier By Methods K-Fold Cross Validation","authors":"Aldi Prajela, Syafriandi Syafriandi, Dony Permana, Dina Fitria","doi":"10.24036/ujsds/vol2-iss1/149","DOIUrl":"https://doi.org/10.24036/ujsds/vol2-iss1/149","url":null,"abstract":"Indonesia implements a democratic system by involving the public in General Elections (Pemilu) for specific political positions. The public actively expresses opinions on social media, especially regarding the 2024 Presidential Election (Pilpres) and the respective presidential candidates, which have become trending topics on Twitter. These tweets are turned into information through sentiment analysis using the Naïve Bayes Classifier (NBC) algorithm with the K-Fold Cross-Validation method. The analysis proceeds through the stages of pre-processing, weighting, labeling, classification using NBC, and testing using a confusion matrix. The NBC classification results showed that Anies Baswedan received 80% positive and 20% negative tweets out of 1186 tweets, Prabowo Subianto received 78% positive and 22% negative tweets out of 1149 tweets, and Ganjar Pranowo received 77% positive and 23% negative tweets out of 1075 tweets. Testing was carried out using the NBC algorithm with K-Fold Cross-Validation using values k=1 to k=10. The function of K-Fold Cross-Validation is to maximize the confusion matrix result. It can be concluded that Anies Baswedan had the highest score in iteration 4, namely a precision of 90%, a recall of 99%, and an accuracy of 91%. Furthermore, Ganjar Pranowo had the highest score in iteration 9, namely a precision of 95%, a recall of 97%, and an accuracy of 92%. Meanwhile, Prabowo Subianto had the highest score in iteration 9, namely a precision of 97%, a recall of 99%, and an accuracy of 94%.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"10 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140432181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ade Eriyen Saputri, Admi Salma, N. Amalita, D. Permana
{"title":"Biplot and Procrustes Analysis of Poverty Indicators By Province in Indonesia in 2015 dan 2019","authors":"Ade Eriyen Saputri, Admi Salma, N. Amalita, D. Permana","doi":"10.24036/ujsds/vol2-iss1/124","DOIUrl":"https://doi.org/10.24036/ujsds/vol2-iss1/124","url":null,"abstract":"Poverty is one of the national problems that the government must address, and it is influenced by several indicators. The success of a government can be seen from changes in poverty. This study compares Indonesia's poverty indicators, expressed as percentages, at the beginning (2015) and the end (2019) of one government term. The indicators that most affect the poverty rate in 2015 and 2019 are identified using biplot analysis, while Procrustes analysis is used to measure the similarity and the magnitude of change in the poverty indicators from 2015 to 2019. The results of the biplot analysis show that the indicator with the highest diversity in 2015 is households with access to proper sanitation services, while in 2019 the indicators with the highest diversity are the poor young population (15-24 years old) not in school, working, or attending training, and households with access to decent drinking water sources. Riau Islands, DKI Jakarta, DI Yogyakarta, and Bali are the provinces with the highest values in almost all poverty indicators, except the percentage of young people (15-24 years old) who are not in school, working, or attending training. The results of the Procrustes analysis show an increase of 9.7% in Indonesia's poverty indicators in 2019 compared to 2015, so the two configurations can be considered very similar.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"44 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140432714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Viola Yuniza, Atus Amadi Putra, Nonong Amalita, Fadhilah Fitri
{"title":"Penerapan Algoritma Naive Bayes untuk Klasifikasi Demam Berdarah Dengue di RSUD dr. Achmad Darwis","authors":"Viola Yuniza, Atus Amadi Putra, Nonong Amalita, Fadhilah Fitri","doi":"10.24036/ujsds/vol2-iss1/128","DOIUrl":"https://doi.org/10.24036/ujsds/vol2-iss1/128","url":null,"abstract":"Dengue Hemorrhagic Fever (DHF) is a disease transmitted through the bite of the Aedes aegypti mosquito. The Limapuluh Kota Regency BPS reported that the morbidity rate due to dengue fever was 14.40% per 100,000 population, a sharp jump from the previous year's morbidity rate of 3.30% per 100,000 population. The main symptoms of dengue fever are fever lasting 2-7 days, muscle and joint pain with or without a rash, dizziness, and even vomiting blood. Dengue infection can cause a range of clinical presentations, from dengue fever and dengue hemorrhagic fever to dengue shock syndrome. A classification method is therefore needed to help and facilitate early diagnosis of dengue fever. The method used is Naive Bayes, which classifies patients as dengue positive or dengue negative. The aim of this research is to determine the classification results for patients suffering from dengue fever and the accuracy of the Naive Bayes method. The results show that 58 patients were classified correctly and 14 patients were classified incorrectly. The accuracy obtained with this algorithm was quite high, namely 80%, while the sensitivity was 65% and the specificity was 86.5%.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"100 10","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140433595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nindi Syahfitri, N. Amalita, Dodi Vionanda, Zamahsary Martha
{"title":"Forecasting Gold Prices in Indonesia using Support Vector Regression with the Grid Search Algorithm","authors":"Nindi Syahfitri, N. Amalita, Dodi Vionanda, Zamahsary Martha","doi":"10.24036/ujsds/vol2-iss1/145","DOIUrl":"https://doi.org/10.24036/ujsds/vol2-iss1/145","url":null,"abstract":"Investment is an effort to increase economic growth in Indonesia, and gold is a popular investment among the public. The value of gold investments tends to increase but is not immune to price fluctuations, so it is important to forecast the price of gold in Indonesia. One method that can be used for this forecast is Support Vector Regression (SVR). SVR looks for a function whose deviation from the target values is no more than ε for all training data. The best SVR model with a linear kernel was obtained from the parameter combination C=0.0625 and ε=0.001, with an RMSE value of 0.19734 and a value of 0.974112. The SVR method is therefore appropriate for forecasting gold prices in Indonesia.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"29 13","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140432065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cindy Caterine, Syafriandi Yolanda, Yenni Kurniawati, Dina Fitria
{"title":"Sentiment Analysis of DANA Application Reviews on Google Play Store Using Naïve Bayes Classifier Algorithm Based on Information Gain","authors":"Cindy Caterine, Syafriandi Yolanda, Yenni Kurniawati, Dina Fitria","doi":"10.24036/ujsds/vol2-iss1/147","DOIUrl":"https://doi.org/10.24036/ujsds/vol2-iss1/147","url":null,"abstract":"DANA is a digital payment platform that provides various features to make it easier for users to make payments, transfers, and balance top-ups online. DANA application users provide a variety of reviews that include both constructive and critical opinions, which can be valuable input for DANA application developers. The purpose of this research is to evaluate the results of sentiment classification of DANA application user reviews on the Google Play Store using the Naïve Bayes Classifier (NBC) method and Information Gain (IG) feature selection. In addition, this study aims to assess the effect of applying IG feature selection on the performance of the resulting model. In this study, reviews are divided into two categories, positive and negative, based on lexicon-based labeling. Data weighting, feature selection, and data splitting are then carried out, with a proportion of 80% training data and 20% test data, before model building. There are two models, namely a model without feature selection (NBC model) and a model with feature selection (NBC-IG model). The evaluation results showed that the NBC model with 1106 features performed well, with 82.91% accuracy, 83.96% precision, and 90.23% recall. Meanwhile, the NBC-IG model with 536 features showed higher performance, with 85.09% accuracy, 85.79% precision, and 92.09% recall. The application of IG feature selection with the IG value limit parameter > 0.01 in the NBC model successfully reduced the number of features by 570, and improved model performance with an increase in accuracy by 2.18%, precision by 1.83%, and recall by 1.86%.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"42 8","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140432739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sherly Amora Jofipasi, Admi Salma, Dodi Vionanda, Dina Fitria
{"title":"Prediction Of Bogor City Rainfall Parameters Using Long Short Term Memory (LSTM)","authors":"Sherly Amora Jofipasi, Admi Salma, Dodi Vionanda, Dina Fitria","doi":"10.24036/ujsds/vol1-iss5/110","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss5/110","url":null,"abstract":"Bogor is a city with high and erratic rainfall, so its rainfall needs to be predicted. Rainfall prediction can be done using the LSTM algorithm. The LSTM algorithm has hidden neuron and epoch parameters that must be set to produce good results, so the best parameters for Bogor rainfall need to be determined. The LSTM prediction worked well with an optimal hidden neuron value of 256, an optimal epoch value of 150, and a MAPE of 1.64, and the predicted data follow the same pattern as the actual data.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":" 17","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139197929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dewi Febiyanti, N. Amalita, D. Permana, Tessy Octavia Mukhti
{"title":"Backpropagation Neural Network Application in Predicting The Stock Price of PT Bank Rakyat Indonesia Tbk","authors":"Dewi Febiyanti, N. Amalita, D. Permana, Tessy Octavia Mukhti","doi":"10.24036/ujsds/vol1-iss5/113","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss5/113","url":null,"abstract":"Investors often make mistakes in stock transactions even when they have chosen the stocks of good companies. What needs to be considered when making stock transactions is the movement of stock prices. The stock price of PT Bank Rakyat Indonesia Tbk has moved both down and up. An increasing stock price benefits investors who sell stocks. However, investors actually decide to purchase stocks. Stock purchase transactions expose investors to high risk because stock prices fluctuate. To anticipate this risk to investors, stock price predictions are made using a Backpropagation Neural Network (BPNN). BPNN can adapt quickly, can predict nonlinear data such as stock prices, and produces a high level of accuracy. The best BPNN model obtained in this study was the BP(5,3,1) model, with a Mean Absolute Percentage Error (MAPE) of 0.8193%. These results show that the model has good network performance and can predict stock prices well, as indicated by its small prediction error.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"165 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139201853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Denia Putri Fajrina, Syafriandi, N. Amalita, Admi Salma
{"title":"Sentiment Analysis of TikTok Application on Twitter using The Naïve Bayes Classifier Algorithm","authors":"Denia Putri Fajrina, Syafriandi, N. Amalita, Admi Salma","doi":"10.24036/ujsds/vol1-iss5/103","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss5/103","url":null,"abstract":"TikTok is a popular social media platform that has gained a lot of attention lately. People of all ages are using this application to share short videos with their friends and followers. The content on TikTok is diverse and can be tailored to individual preferences, but there have been concerns about the presence of vulgar content that can be viewed by minors as there are no age restrictions. This has led to public scrutiny of the application on social media platforms like Twitter. To address this issue, sentiment analysis was conducted on reviews of the TikTok application to help users make informed decisions about its use. The aim of this analysis was to determine whether people's opinions about TikTok were positive or negative. The results were classified into two categories, positive and negative, using the Naïve Bayes Classifier method. The analysis was carried out using 80% training data and 20% testing data, and the results showed an accuracy rate of 80.32%, with a recall value of 97% and a precision value of 78%. This information can help users make informed decisions about using the TikTok application.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":" 18","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139207228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wulan Septya Zulmawati, Nonong Amalita, Syafriandi Syafriandi, Admi Salma
{"title":"Bitcoin Price Prediction Using Support Vector Regression","authors":"Wulan Septya Zulmawati, Nonong Amalita, Syafriandi Syafriandi, Admi Salma","doi":"10.24036/ujsds/vol1-iss5/121","DOIUrl":"https://doi.org/10.24036/ujsds/vol1-iss5/121","url":null,"abstract":"Cryptocurrency provides the highest return compared to other investment instruments, attracting many novice traders to crypto as a way to make significant profits in the short term. One of the most widely used cryptocurrencies is Bitcoin. Trading is closely related to technical analysis, and the variety of technical analysis techniques makes it difficult for beginner traders to choose the right one. Machine learning methods can offer an alternative that helps beginner traders in the crypto market through prediction. One machine learning method for prediction is Support Vector Regression (SVR). Using the Grid Search algorithm, this method achieves a good predictive accuracy of 99.25% and a MAPE of 8.70%.","PeriodicalId":220933,"journal":{"name":"UNP Journal of Statistics and Data Science","volume":"207 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139202737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}