{"title":"A Comparative Assessment of Frequentist Forecasting Models: Evidence from the S&P 500 Pharmaceuticals Index","authors":"C. Muneza, Asad M. Khan, Waqar Badshah","doi":"10.26650/joda.1312382","DOIUrl":"https://doi.org/10.26650/joda.1312382","url":null,"abstract":"This paper compares three forecasting methods, the autoregressive integrated moving average (ARIMA), generalized autoregressive conditional heteroscedasticity (GARCH), and neural network autoregression (NNAR) methods, using the S&P 500 Pharmaceuticals Index. The objective is to identify the most accurate model based on the mean average forecasting error (MAFE). The results consistently show the NNAR model to outperform ARIMA and GARCH and to exhibit a significantly lower MAFE. The existing literature presents conflicting findings on forecasting model accuracy for stock indexes. While studies have explored various models, no universally applicable model exists. Therefore, a comparative analysis is crucial. The methodology includes data collection and cleaning, exploratory analysis, and model building. The daily closing prices of pharmaceutical stocks from the S&P 500 serve as the dataset. The exploratory analysis reveals an upward trend and increasing heteroscedasticity in the pharmaceuticals index, with the unit root tests confirming non-stationarity. To address this, the dataset has been transformed into stationary returns using logarithmic and differencing techniques. Model building involves splitting the dataset into training and test sets. The training set determines the best-fit models for each method. The models are then compared using MAFE on the test set, with the model possessing the lowest MAFE being considered the best. 
The findings provide insights into model accuracy for pharmaceutical industry indexes, aiding investor predictions, with the comparative analysis emphasizing tailored forecasting models for specific indexes and datasets.","PeriodicalId":250029,"journal":{"name":"Journal of Data Applications","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133777226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Application with Python Software for the Classification of Chemical Data","authors":"Gonca Ertürk, O. Akpolat","doi":"10.26650/joda.1264915","DOIUrl":"https://doi.org/10.26650/joda.1264915","url":null,"abstract":"Nowadays, much data is produced in analytical devices in the field of chemistry and can be stored digitally. By evaluating these data, it is possible to decipher the relationships between them and to make predictions for the new data measured using these relationships with the help of data mining algorithms. One of the areas of chemistry where a lot of data are produced is the environment. Most of the pollution in wastewater consists of detergents, organic substances, and oils. The main processes in wastewater treatment are to destroy (1) biodegradable organic matter, (2) suspended solids, (3) harmful heavy metals and toxic compounds, (4) nitrogen and phosphorus depending on the ambient conditions, and (5) pathogenic organisms. Monitoring the wastewater treatment processes and providing the necessary controls are based on the continuous determination of the wastewater and activated sludge characteristics. The basic measurement criteria for determining the properties of wastewater are the amounts of biochemical oxygen demand (BOD5), chemical oxygen demand (COD), total organic carbon (TOC) and dissolved oxygen (DO). Among these parameters, BOD5 measurement takes at least 5 days, while others can be measured in 1-2 hours max. If BOD5 values could be mathematically associated with the other parameters, it would provide a great advantage in terms of controlling the estimated process depending on them in a shorter time. In the study conducted within this framework, a set of data was created by measuring the above-mentioned parameters from 334 samples taken from a treatment plant for statistical evaluation, and the interactions of the parameters in this data set with each other were examined by a decision tree method. 
Thus, this study tries to estimate the weight of the parameters on the BOD5 value of the samples. The data mining algorithm selected for this modelling was written with Python software and the performance of the algorithm was examined in estimating the BOD5 parameter depending on other parameters by extracting the decision tree rules.","PeriodicalId":250029,"journal":{"name":"Journal of Data Applications","volume":"185 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123551931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Ethical Dimension of Artificial Intelligence","authors":"Gökçe Karahan Adalı","doi":"10.26650/joda.1253475","DOIUrl":"https://doi.org/10.26650/joda.1253475","url":null,"abstract":"","PeriodicalId":250029,"journal":{"name":"Journal of Data Applications","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123398113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Automobile Sales in Turkiye with Regression-Based Machine Learning Algorithms","authors":"Merve Babaoglu, Ahmet Coşkunçay, Tolga Aydin","doi":"10.26650/joda.1242645","DOIUrl":"https://doi.org/10.26650/joda.1242645","url":null,"abstract":"The automobile sector is the locomotive of industrialized countries. The employment opportunities it creates are of great value because of its interconnectedness with other industries and the value it adds. Demand forecasting studies in such an important sector are one of the main drivers for the provision of raw materials and services needed in the future. In this study, 10 independent variables are used that directly or indirectly affect the level of car sales, which is our dependent variable. These variables are gross domestic product, real sector confidence index, capital expenditures, household consumption expenditures, inflation rate, consumer confidence index, percentage of one-year term deposits, and oil barrel, gold, and dollar prices. The dataset used consists of annual data between 2000 and 2021. To examine the sales forecast model, two variables that affect minimum sales are first extracted from the model using the least squares method. Linear Regression, Decision Tree, Random Forest, Ridge, AdaBoost, Elastic-net, and Lasso Regression algorithms are applied to build a predictive model with these variables. The Mean Squared Error (MSE), Mean Absolute Error (MAE), and coefficient of determination (R²) are used to compare the performance of the predictive models. 
This study proposes an approach for sectors affected directly or indirectly by automotive sales to gain foresight on this issue.","PeriodicalId":250029,"journal":{"name":"Journal of Data Applications","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126755903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Istanbul’s Community Mobility Changes During the COVID-19 Pandemic: A Spatial Analysis","authors":"A. Arik, G. Cavdaroglu","doi":"10.26650/joda.1215566","DOIUrl":"https://doi.org/10.26650/joda.1215566","url":null,"abstract":"COVID-19 was the most recent pandemic to strike humanity. Moreover, this pandemic occurred during the most active period of global interaction and mobility, unlike pandemics like cholera, plague, and flu in earlier centuries. Many countries restricted domestic mobility after suspending international mobility to prevent the pandemic from spreading. Although these policies differ from nation to nation, they have affected the mobility of communities. This study examined spatial and non-spatial independent variables that affected how the community’s mobility patterns changed in various locations, including parks, transit stations, workplaces, grocery stores and pharmacies, and residential areas in Istanbul, Türkiye. The impact of the independent spatial variables on the mobility changes was examined after identifying the non-spatial independent variables influencing the mobility changes in 6 different areas. It was determined that the altitude variable, expected to impact how mobility changed, had no overall impact on the dependent variable. On the other hand, the dependent variables representing the mobility changes were affected by the independent variables representing the county center’s latitude and longitude values and whether the county is located near the sea. 
Regression analysis across Türkiye will be performed in upcoming studies using an updated version of the methodology used in this study.","PeriodicalId":250029,"journal":{"name":"Journal of Data Applications","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125450125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}