Stats Pub Date: 2022-11-28 DOI: 10.3390/stats5040073
I. Tsolas
{"title":"Assessing Regional Entrepreneurship: A Bootstrapping Approach in Data Envelopment Analysis","authors":"I. Tsolas","doi":"10.3390/stats5040073","DOIUrl":"https://doi.org/10.3390/stats5040073","url":null,"abstract":"The aim of the present paper is to demonstrate the viability of using data envelopment analysis (DEA) in a regional context to evaluate entrepreneurial activities. DEA was used to assess regional entrepreneurship in Greece using individual measures of entrepreneurship as inputs and employment rates as outputs. In addition to point estimates, a bootstrap algorithm was used to produce bias-corrected metrics. In the light of the results of the study, the Greek regions perform differently in terms of converting entrepreneurial activity into job creation. Moreover, there is some evidence that unemployment may be a driver of entrepreneurship and thus negatively affects DEA-based inefficiency. The derived indicators can serve as diagnostic tools and can also be used for the design of various interventions at the regional level.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44335683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2022-11-23 DOI: 10.3390/stats5040072
P. N. Rathie, L. Ozelim
{"title":"On the Relation between Lambert W-Function and Generalized Hypergeometric Functions","authors":"P. N. Rathie, L. Ozelim","doi":"10.3390/stats5040072","DOIUrl":"https://doi.org/10.3390/stats5040072","url":null,"abstract":"In the theory of special functions, finding correlations between different types of functions is of great interest as unifying results, especially when considering issues such as analytic continuation. In the present paper, the relation between Lambert W-function and generalized hypergeometric functions is discussed. It will be shown that it is possible to link these functions by following two different strategies, namely, by means of the direct and inverse Mellin transform of Lambert W-function and by solving the trinomial equation originally studied by Lambert and Euler. The new results can be used both to numerically evaluate Lambert W-function and to study its analytic structure.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41910380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2022-11-18 DOI: 10.3390/stats5040071
E. Boone, Jan Hannig, R. Ghanam, Sujit Ghosh, F. Ruggeri, S. Prudhomme
{"title":"Model Validation of a Single Degree-of-Freedom Oscillator: A Case Study","authors":"E. Boone, Jan Hannig, R. Ghanam, Sujit Ghosh, F. Ruggeri, S. Prudhomme","doi":"10.3390/stats5040071","DOIUrl":"https://doi.org/10.3390/stats5040071","url":null,"abstract":"In this paper, we investigate a validation process in order to assess the predictive capabilities of a single degree-of-freedom oscillator. Model validation is understood here as the process of determining the accuracy with which a model can predict observed physical events or important features of the physical system. Therefore, assessment of the model needs to be performed with respect to the conditions under which the model is used in actual simulations of the system and to specific quantities of interest used for decision-making. Model validation also supposes that the model be trained and tested against experimental data. In this work, virtual data are produced from a non-linear single degree-of-freedom oscillator, the so-called oracle model, which is supposed to provide an accurate representation of reality. The mathematical model to be validated is derived from the oracle model by simply neglecting the non-linear term. The model parameters are identified via Bayesian updating. This calibration process also includes a modeling error due to model misspecification, which is modeled as a normal probability density function with zero mean and a standard deviation to be calibrated.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47929046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2022-11-17 DOI: 10.3390/stats5040069
Elisângela C. Biazatti, G. Cordeiro, Gabriela M. Rodrigues, E. Ortega, L. H. de Santana
{"title":"A Weibull-Beta Prime Distribution to Model COVID-19 Data with the Presence of Covariates and Censored Data","authors":"Elisângela C. Biazatti, G. Cordeiro, Gabriela M. Rodrigues, E. Ortega, L. H. de Santana","doi":"10.3390/stats5040069","DOIUrl":"https://doi.org/10.3390/stats5040069","url":null,"abstract":"Motivated by the recent popularization of the beta prime distribution, a more flexible generalization is presented to fit symmetrical or asymmetrical and bimodal data, and a non-monotonic failure rate. Thus, the Weibull-beta prime distribution is defined, and some of its structural properties are obtained. The parameters are estimated by maximum likelihood, and a new regression model is proposed. Some simulations reveal that the estimators are consistent, and applications to censored COVID-19 data show the adequacy of the models.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48242201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2022-11-17 DOI: 10.3390/stats5040070
Kevin D. Dayaratna, Jesse M. Crosson, Chandler Hubbard
{"title":"Closed Form Bayesian Inferences for Binary Logistic Regression with Applications to American Voter Turnout","authors":"Kevin D. Dayaratna, Jesse M. Crosson, Chandler Hubbard","doi":"10.3390/stats5040070","DOIUrl":"https://doi.org/10.3390/stats5040070","url":null,"abstract":"Understanding the factors that influence voter turnout is a fundamentally important question in public policy and political science research. Bayesian logistic regression models are useful for incorporating individual level heterogeneity to answer these and many other questions. When these questions involve incorporating individual level heterogeneity for large data sets that include many demographic and ethnic subgroups, however, standard Markov Chain Monte Carlo (MCMC) sampling methods to estimate such models can be quite slow and impractical to perform in a reasonable amount of time. We present an innovative closed form Empirical Bayesian approach that is significantly faster than MCMC methods, thus enabling the estimation of voter turnout models that had previously been considered computationally infeasible. Our results shed light on factors impacting voter turnout data in the 2000, 2004, and 2008 presidential elections. We conclude with a discussion of these factors and the associated policy implications. We emphasize, however, that although our application is to the social sciences, our approach is fully generalizable to the myriad other fields involving statistical models with binary dependent variables and high-dimensional parameter spaces.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43746245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2022-11-16 DOI: 10.3390/stats5040068
Juan Borrero, J. Mariscal, Alfonso Vargas-Sánchez
{"title":"A New Predictive Algorithm for Time Series Forecasting Based on Machine Learning Techniques: Evidence for Decision Making in Agriculture and Tourism Sectors","authors":"Juan Borrero, J. Mariscal, Alfonso Vargas-Sánchez","doi":"10.3390/stats5040068","DOIUrl":"https://doi.org/10.3390/stats5040068","url":null,"abstract":"Accurate time series prediction techniques are becoming fundamental to modern decision support systems. As massive data processing develops in its practicality, machine learning (ML) techniques applied to time series can automate and improve prediction models. The radical novelty of this paper is the development of a hybrid model that combines a new approach to the classical Kalman filter with machine learning techniques, i.e., support vector regression (SVR) and nonlinear autoregressive (NAR) neural networks, to improve the performance of existing predictive models. The proposed hybrid model uses, on the one hand, an improved Kalman filter method that eliminates the convergence problems of time series data with large error variance and, on the other hand, an ML algorithm as a correction factor to predict the model error. The results reveal that our hybrid models obtain accurate predictions, substantially reducing the root mean square and mean absolute errors compared to the classical and alternative Kalman filter models and achieving a goodness of fit greater than 0.95. Furthermore, the generalization of this algorithm was confirmed by its validation in two different scenarios.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47232726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2022-11-15 DOI: 10.3390/stats5040067
Daniele Cuntrera, V. Falco, O. Giambalvo
{"title":"On the Sampling Size for Inverse Sampling","authors":"Daniele Cuntrera, V. Falco, O. Giambalvo","doi":"10.3390/stats5040067","DOIUrl":"https://doi.org/10.3390/stats5040067","url":null,"abstract":"In the Big Data era, sampling remains a central theme. This paper investigates the characteristics of inverse sampling on two different datasets (real and simulated) to determine when big data become too small for inverse sampling to be used and to examine the impact of the sampling rate of the subsamples. Through the simulation study and real-data application, we find that the method, using the appropriate subsample size for both the mean and proportion parameters, performs well with datasets smaller than big data. Different settings related to the selection bias severity are considered during the simulation study and real application.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41829878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2022-11-10 DOI: 10.3390/stats5040066
Sudaraka Tholkage, Qi Zheng, K. B. Kulasekera
{"title":"Conditional Kaplan–Meier Estimator with Functional Covariates for Time-to-Event Data","authors":"Sudaraka Tholkage, Qi Zheng, K. B. Kulasekera","doi":"10.3390/stats5040066","DOIUrl":"https://doi.org/10.3390/stats5040066","url":null,"abstract":"Due to the wide availability of functional data from multiple disciplines, the studies of functional data analysis have become popular in the recent literature. However, the related development in censored survival data has been relatively sparse. In this work, we consider the problem of analyzing time-to-event data in the presence of functional predictors. We develop a conditional generalized Kaplan–Meier (KM) estimator that incorporates functional predictors using kernel weights and rigorously establish its asymptotic properties. In addition, we propose to select the optimal bandwidth based on a time-dependent Brier score. We then carry out extensive numerical studies to examine the finite sample performance of the proposed functional KM estimator and bandwidth selector. We also illustrate the practical usage of our proposed method by using a data set from the Alzheimer’s Disease Neuroimaging Initiative.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41563204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2022-11-09 DOI: 10.3390/stats5040065
D. Griffith
{"title":"Selected Payback Statistical Contributions to Matrix/Linear Algebra: Some Counterflowing Conceptualizations","authors":"D. Griffith","doi":"10.3390/stats5040065","DOIUrl":"https://doi.org/10.3390/stats5040065","url":null,"abstract":"Matrix/linear algebra continues bestowing benefits on theoretical and applied statistics, a practice it began decades ago (e.g., Fisher used the word matrix in a 1941 publication), through a myriad of contributions, from recognition of a suite of matrix properties relevant to statistical concepts, to matrix specifications of linear and nonlinear techniques. Consequently, focused parts of matrix algebra are topics of several statistics books and journal articles. Contributions mostly have been unidirectional, from matrix/linear algebra to statistics. Nevertheless, statistics offers great potential for making this interface a bidirectional exchange point, the theme of this review paper. Not surprisingly, regression, the workhorse of statistics, provides one tool for such historically based recompense. Another prominent one is the mathematical matrix theory eigenfunction abstraction. A third is special matrix operations, such as Kronecker sums and products. A fourth is multivariable calculus linkages, especially arcane matrix/vector operators as well as the Jacobian term associated with variable transformations. A fifth, and the final idea this paper treats, is random matrices/vectors within the context of simulation, particularly for correlated data. These are the five statistics subjects prospectively reviewed here as capable of informing, inspiring, or otherwise furnishing insight to the far more general world of linear algebra.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"147 9","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41311753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2022-11-05 DOI: 10.3390/stats5040064
Tzong-Ru Tsai, Hua Xin, Yanqin Fan, Y. Lio
{"title":"Bias-Corrected Maximum Likelihood Estimation and Bayesian Inference for the Process Performance Index Using Inverse Gaussian Distribution","authors":"Tzong-Ru Tsai, Hua Xin, Yanqin Fan, Y. Lio","doi":"10.3390/stats5040064","DOIUrl":"https://doi.org/10.3390/stats5040064","url":null,"abstract":"In this study, the estimation methods of bias-corrected maximum likelihood (BCML), bootstrap BCML (B-BCML) and Bayesian using Jeffreys’ prior distribution were proposed for the inverse Gaussian distribution with small sample cases to obtain the ML and Bayes estimators of the model parameters and the process performance index based on the lower specification process performance index. Moreover, an approximate confidence interval and the highest posterior density interval of the process performance index were established via the delta and Bayesian inference methods, respectively. To overcome the computational difficulty of sampling from the posterior distribution in Bayesian inference, the Markov chain Monte Carlo approach was used to implement the proposed Bayesian inference procedures. Monte Carlo simulations were conducted to evaluate the performance of the proposed BCML, B-BCML and Bayesian estimation methods. An example of the active repair times for an airborne communication transceiver is used for illustration.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43296690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}