{"title":"Empirical Performance of CART, C5.0 and Random Forest Classification Algorithms for Decision Trees","authors":"Bissilimou Racidatou Orounla, Akoeugnigan Idelphonse Sode, Kolawole Valère Salako, Romain Glèlè Kakaï","doi":"10.16929/ajas/2023.1399.274","DOIUrl":"https://doi.org/10.16929/ajas/2023.1399.274","url":null,"abstract":"This study compares the performance of <i>CART</i>, <i>C5.0</i> and Random Forest (<i>RF</i>) algorithms. 25 continuous predictors and 25 factors were simulated using a population size of 10,000. From this population, sample data were generated by varying the number of predictors, the proportion of categorical versus continuous predictors and the sample size. The performance of the tree algorithms increases with sample size and the number of variables, but the performance of <i>RF</i> is much higher than that of <i>CART</i> and <i>C5.0</i>. Irrespective of the algorithm, performance decreases when there are more categorical variables than continuous variables.","PeriodicalId":332314,"journal":{"name":"African Journal of Applied Statistics","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135007870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gumbel copula mortality dependence modeling","authors":"Walter Omonywa Onchere, Calvin Bitange Maina, Fred Nyamitago Monari","doi":"10.16929/ajas/2023.1383.273","DOIUrl":"https://doi.org/10.16929/ajas/2023.1383.273","url":null,"abstract":"Using joint-life last-survivor annuities data, we conduct an analysis of the joint lifetime dependence. In the current paper, we apply the Gumbel copula and compare it to the Clayton copula approaches to address dependence effects. The method of moments procedure is used to calibrate the copula dependence parameter and maximum likelihood estimation for the marginal specifications. Subsequently, the performance of the marginals is compared following the criteria values. The findings show that the Gumbel copula with logistic marginals appropriately accounts for the dependence effects. These research findings have significant implications for the valuation of joint-life policies to avoid pricing error","PeriodicalId":332314,"journal":{"name":"African Journal of Applied Statistics","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135006828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quality report of infectious disease modeling techniques for point-referenced spatial data: A Systematic review","authors":"Romuald Beh Mba, Bruno Enagnon Lokonon, Romain Glèlè Kakaï","doi":"10.16929/ajas/2023.1368.272","DOIUrl":"https://doi.org/10.16929/ajas/2023.1368.272","url":null,"abstract":"Spatial data modeling can provide significant value to healthcare organizations by improving decision support, resource management and distribution, and clinical outcomes. The aim of this study was to (i) summarize the trends of the modeling techniques used to analyze point-referenced spatial data in epidemiology and (ii) examine whether all information required when applying these modeling techniques was properly reported in the published papers. The literature search was limited to journal papers published from January 2010 to June 2022 and used PubMed, Scopus, Crossref, and Google Scholar. From 528 articles identified with the defined keywords, 351 were retained for the review. The results revealed that the use of modeling techniques for spatial data on infectious diseases has increased exponentially over time. The most common spatial method was Empirical Bayesian Kriging [EBK] (52% of the selected articles), followed by Spatial GLMMs (34%) and Spatial smoothing Kernel Estimation (13%).","PeriodicalId":332314,"journal":{"name":"African Journal of Applied Statistics","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135007878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empirical assessment of the physico-chemical determinants of soil spatial variability in Sub-Saharan Africa","authors":"C. Agbangba, E. E. Gongnet, R. G. Kakaï","doi":"10.16929/ajas/2022.1319.270","DOIUrl":"https://doi.org/10.16929/ajas/2022.1319.270","url":null,"abstract":"An appropriate understanding of the spatial variability of soil properties could support sustainable soil nutrient management. The study aims to identify the most important soil characteristics driving spatial variability in tropical soil. A total of 5000 sample locations were randomly generated from the Sub-Saharan Africa map and the sample values were obtained from www.soilgrid.org. Various variogram models were tested, the best-fitting variogram parameters were used to simulate 10000 replications of each attribute, and the spatial dependence indices were computed. Results suggested that soil N, pH and organic carbon are the main drivers of spatial variability, information that can help to better control experimental error.","PeriodicalId":332314,"journal":{"name":"African Journal of Applied Statistics","volume":"199 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121281513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A complete computer-based approach for data generation patterning to a pdf in $\\mathbb{R}$ and application to gamma and gig data","authors":"Gorgui Gning, Aladji Babacar Niang, Soumaila Dembele, G. Lo","doi":"10.16929/ajas/2022.1331.271","DOIUrl":"https://doi.org/10.16929/ajas/2022.1331.271","url":null,"abstract":"Here, we present an automatic data generation method which is fully computer-based for a variate $X$ with an absolutely continuous probability density function (pdf) $f$ exactly computable. The method uses computer-based calculations of integrals (trapezoidal and/or the Monte-Carlo method) for approximating the cumulative distribution function and next, the dichotomy algorithm to get the quantile function from which we obtain data from $f$. We apply the method to generate gig(a,b,c) data. The comparison with analogues, as in R software, is very successful. The method may work where the rejection method fails because of a lack of a pdf bound which can be generated. The method might be slower but the era of more and more powerful computers is favorable to it. The implementations for gamma and/or gig laws in R codes are presented.","PeriodicalId":332314,"journal":{"name":"African Journal of Applied Statistics","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122153612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Estimation of Presidential Elections in Ghana: A Validation Approach","authors":"E. N. Nortey, K. Asah-Asante, R. Minkah, Edmund Fosu-Agyemang","doi":"10.16929/ajas/2022.1297.269","DOIUrl":"https://doi.org/10.16929/ajas/2022.1297.269","url":null,"abstract":"Elections are one of the barometers through which electorates measure the performance of governments and decide whether to renew their mandate or not. The success of every election goes a long way to strengthen the frontiers of a country's democracy and provide legitimacy for those who hold political power. However, the electoral process of many African countries has been challenged in courts or allegations of fraud and vote rigging are leveled against the winning party or candidate. Therefore, there is the need for a statistical method for checking and validating election results to ascertain fraud and vote rigging claims. Existing validation methods include the Parallel Vote Tabulation methodology. However, some significant disadvantages of this approach are issues of cost, sampling techniques and sample size determination. To overcome these, this study resorts to using the Dirichlet multinomial Bayesian model to compute posterior probabilities of valid votes cast and Bayesian credible intervals to ascertain the legitimacy of the votes cast. Using the Ghana general elections in 2020, the fitted Bayesian model accurately predicted approximately 99% of the proportion of votes obtained by New Patriotic Party, National Democratic Congress and all Other Political Parties. Also, the valid votes received by all the political parties fall within the Bayesian credible intervals indicating that the credibility of the 2020 presidential elections held in Ghana may not be in doubt.","PeriodicalId":332314,"journal":{"name":"African Journal of Applied Statistics","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129781840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Imputation methods for missing values: the case of Senegalese meteorological data","authors":"Sémou di, E. Deme, A. Deme","doi":"10.16929/ajas/2022.1245.267","DOIUrl":"https://doi.org/10.16929/ajas/2022.1245.267","url":null,"abstract":"Climate change studies require comprehensive databases to analyze the climate signal, to monitor its evolution, and to predict future changes more accurately. Since complete observation of any continuous process is almost impossible, it is inevitable to encounter missing information in meteorological databases. The aim of this work is to evaluate the performance of five ($5$) imputation methods: missForest, $k$-nn, ppca, mice and imputeTS. The results show that missForest is the best performing method to handle missing temperature data. In the case of precipitation data, the imputeTS method is the preferred one.","PeriodicalId":332314,"journal":{"name":"African Journal of Applied Statistics","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125932419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatial prediction of soil organic matter in Adingnigon (Benin) using Bayesian Maximum Entropy (BME)","authors":"E. E. Gongnet, C. Agbangba, Tranquillin Sédjro Affossogbe, R. G. Kakaï","doi":"10.16929/ajas/2022.1279.268","DOIUrl":"https://doi.org/10.16929/ajas/2022.1279.268","url":null,"abstract":"Demographic pressure and climate change have heavily affected soil fertility. Proper soil management requires an understanding of the spatial variation of soil properties. In this study, Bayesian Maximum Entropy (BME) was used to explore the variation of soil pH and soil organic matter (SOM) at Adingnigon (Benin) using 106 soil samples. The prediction maps indicated a lower concentration (0.6 to 0.8 g/kg) of SOM toward the center and pH mostly around 5.8 to 6.5 with lower error variance, suggesting an acidic soil. These results provide useful information for managing soil fertility to improve crop yields.","PeriodicalId":332314,"journal":{"name":"African Journal of Applied Statistics","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132911927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cycling in a Variance Exchange Algorithm: Its Influence and Remedy","authors":"Okim I. Ikpan, F. Nwobi","doi":"10.16929/ajas/2021.1227.267","DOIUrl":"https://doi.org/10.16929/ajas/2021.1227.267","url":null,"abstract":"This paper introduces cycling in a variance exchange algorithm, a sequential search procedure for the construction of exact $D$-optimal designs done over a list of $\\tilde{X}$ candidate points and involves the iterative improvement of an initial $N$-trial design. Cycling occurs in this sequence at a certain step of the exchange when a point that was earlier removed from the design at the k-th step qualifies to return to the design at the (k+1)-th point with determinant of the information matrix equal to that of the k-th step or even that of the (k-1)-th step and therefore not guaranteeing the N-point exact D-optimal design. A method to overcome cycling is finally proposed.","PeriodicalId":332314,"journal":{"name":"African Journal of Applied Statistics","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122607880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal model for weaning-weight of Bunaji Bulls at NAPRI Farm, Shika, Nigeria","authors":"A. Rabe","doi":"10.16929/ajas/2021.1199.265","DOIUrl":"https://doi.org/10.16929/ajas/2021.1199.265","url":null,"abstract":"Empirical models have over the years been commonly established by animal research centers for the study of weight-age profiles in order to understand the metabolic processes of growth. They provide efficient parameter estimates for mature weight and rate of maturing, but were found to consistently either over- or underestimate the mature weight. They also perform poorly in predicting weight in early life or beyond the range of input data. At the National Animal Production Research Institute (NAPRI) farm, Shika, Brody was established as the model that provides efficient parameter estimates of weight-age profiles for Bunaji bulls. However, a major drawback of the model is its consistent underestimation of weight prior to six months of age, leading to poor prediction of weaning weight. To address this shortcoming, we propose in this article a joint mean-covariance model that provides optimal parameter estimates for the weaning weight of Bunaji bulls.","PeriodicalId":332314,"journal":{"name":"African Journal of Applied Statistics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130560879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}