Stats, Pub Date: 2022-11-18, DOI: 10.3390/stats5040071
E. Boone, Jan Hannig, R. Ghanam, Sujit Ghosh, F. Ruggeri, S. Prudhomme
{"title":"Model Validation of a Single Degree-of-Freedom Oscillator: A Case Study","authors":"E. Boone, Jan Hannig, R. Ghanam, Sujit Ghosh, F. Ruggeri, S. Prudhomme","doi":"10.3390/stats5040071","DOIUrl":"https://doi.org/10.3390/stats5040071","url":null,"abstract":"In this paper, we investigate a validation process in order to assess the predictive capabilities of a single degree-of-freedom oscillator. Model validation is understood here as the process of determining the accuracy with which a model can predict observed physical events or important features of the physical system. Therefore, assessment of the model needs to be performed with respect to the conditions under which the model is used in actual simulations of the system and to specific quantities of interest used for decision-making. Model validation also supposes that the model be trained and tested against experimental data. In this work, virtual data are produced from a non-linear single degree-of-freedom oscillator, the so-called oracle model, which is supposed to provide an accurate representation of reality. The mathematical model to be validated is derived from the oracle model by simply neglecting the non-linear term. The model parameters are identified via Bayesian updating. 
This calibration process also accounts for a modeling error due to model misspecification, which is modeled as a normal probability density function with zero mean and a standard deviation to be calibrated.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47929046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2022-11-17, DOI: 10.3390/stats5040069
Elisângela C. Biazatti, G. Cordeiro, Gabriela M. Rodrigues, E. Ortega, L. H. de Santana
{"title":"A Weibull-Beta Prime Distribution to Model COVID-19 Data with the Presence of Covariates and Censored Data","authors":"Elisângela C. Biazatti, G. Cordeiro, Gabriela M. Rodrigues, E. Ortega, L. H. de Santana","doi":"10.3390/stats5040069","DOIUrl":"https://doi.org/10.3390/stats5040069","url":null,"abstract":"Motivated by the recent popularization of the beta prime distribution, a more flexible generalization is presented to fit symmetrical or asymmetrical and bimodal data, and a non-monotonic failure rate. Thus, the Weibull-beta prime distribution is defined, and some of its structural properties are obtained. The parameters are estimated by maximum likelihood, and a new regression model is proposed. Some simulations reveal that the estimators are consistent, and applications to censored COVID-19 data show the adequacy of the models.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48242201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2022-11-17, DOI: 10.3390/stats5040070
Kevin D. Dayaratna, Jesse M. Crosson, Chandler Hubbard
{"title":"Closed Form Bayesian Inferences for Binary Logistic Regression with Applications to American Voter Turnout","authors":"Kevin D. Dayaratna, Jesse M. Crosson, Chandler Hubbard","doi":"10.3390/stats5040070","DOIUrl":"https://doi.org/10.3390/stats5040070","url":null,"abstract":"Understanding the factors that influence voter turnout is a fundamentally important question in public policy and political science research. Bayesian logistic regression models are useful for incorporating individual level heterogeneity to answer these and many other questions. When these questions involve incorporating individual level heterogeneity for large data sets that include many demographic and ethnic subgroups, however, standard Markov Chain Monte Carlo (MCMC) sampling methods to estimate such models can be quite slow and impractical to perform in a reasonable amount of time. We present an innovative closed form Empirical Bayesian approach that is significantly faster than MCMC methods, thus enabling the estimation of voter turnout models that had previously been considered computationally infeasible. Our results shed light on factors impacting voter turnout data in the 2000, 2004, and 2008 presidential elections. We conclude with a discussion of these factors and the associated policy implications. 
We emphasize, however, that although our application is to the social sciences, our approach is fully generalizable to the myriad other fields involving statistical models with binary dependent variables and high-dimensional parameter spaces.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43746245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2022-11-16, DOI: 10.3390/stats5040068
Juan Borrero, J. Mariscal, Alfonso Vargas-Sánchez
{"title":"A New Predictive Algorithm for Time Series Forecasting Based on Machine Learning Techniques: Evidence for Decision Making in Agriculture and Tourism Sectors","authors":"Juan Borrero, J. Mariscal, Alfonso Vargas-Sánchez","doi":"10.3390/stats5040068","DOIUrl":"https://doi.org/10.3390/stats5040068","url":null,"abstract":"Accurate time series prediction techniques are becoming fundamental to modern decision support systems. As massive data processing develops in its practicality, machine learning (ML) techniques applied to time series can automate and improve prediction models. The radical novelty of this paper is the development of a hybrid model that combines a new approach to the classical Kalman filter with machine learning techniques, i.e., support vector regression (SVR) and nonlinear autoregressive (NAR) neural networks, to improve the performance of existing predictive models. The proposed hybrid model uses, on the one hand, an improved Kalman filter method that eliminates the convergence problems of time series data with large error variance and, on the other hand, an ML algorithm as a correction factor to predict the model error. The results reveal that our hybrid models obtain accurate predictions, substantially reducing the root mean square and absolute mean errors compared to the classical and alternative Kalman filter models and achieving a goodness of fit greater than 0.95. 
Furthermore, the generalization of this algorithm was confirmed by its validation in two different scenarios.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47232726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2022-11-15, DOI: 10.3390/stats5040067
Daniele Cuntrera, V. Falco, O. Giambalvo
{"title":"On the Sampling Size for Inverse Sampling","authors":"Daniele Cuntrera, V. Falco, O. Giambalvo","doi":"10.3390/stats5040067","DOIUrl":"https://doi.org/10.3390/stats5040067","url":null,"abstract":"In the Big Data era, sampling remains a central theme. This paper investigates the characteristics of inverse sampling on two different datasets (real and simulated) to determine when big data become too small for inverse sampling to be used and to examine the impact of the sampling rate of the subsamples. We find that the method, using the appropriate subsample size for both the mean and proportion parameters, performs well with a smaller dataset than big data through the simulation study and real-data application. Different settings related to the selection bias severity are considered during the simulation study and real application.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41829878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2022-11-10, DOI: 10.3390/stats5040066
Sudaraka Tholkage, Qi Zheng, K. B. Kulasekera
{"title":"Conditional Kaplan–Meier Estimator with Functional Covariates for Time-to-Event Data","authors":"Sudaraka Tholkage, Qi Zheng, K. B. Kulasekera","doi":"10.3390/stats5040066","DOIUrl":"https://doi.org/10.3390/stats5040066","url":null,"abstract":"Due to the wide availability of functional data from multiple disciplines, studies of functional data analysis have become popular in the recent literature. However, the related development in censored survival data has been relatively sparse. In this work, we consider the problem of analyzing time-to-event data in the presence of functional predictors. We develop a conditional generalized Kaplan–Meier (KM) estimator that incorporates functional predictors using kernel weights and rigorously establish its asymptotic properties. In addition, we propose to select the optimal bandwidth based on a time-dependent Brier score. We then carry out extensive numerical studies to examine the finite sample performance of the proposed functional KM estimator and bandwidth selector. We also illustrate the practical usage of our proposed method by using a data set from the Alzheimer’s Disease Neuroimaging Initiative.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41563204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2022-11-09, DOI: 10.3390/stats5040065
D. Griffith
{"title":"Selected Payback Statistical Contributions to Matrix/Linear Algebra: Some Counterflowing Conceptualizations","authors":"D. Griffith","doi":"10.3390/stats5040065","DOIUrl":"https://doi.org/10.3390/stats5040065","url":null,"abstract":"Matrix/linear algebra continues bestowing benefits on theoretical and applied statistics, a practice it began decades ago (re Fisher used the word matrix in a 1941 publication), through a myriad of contributions, from recognition of a suite of matrix properties relevant to statistical concepts, to matrix specifications of linear and nonlinear techniques. Consequently, focused parts of matrix algebra are topics of several statistics books and journal articles. Contributions mostly have been unidirectional, from matrix/linear algebra to statistics. Nevertheless, statistics offers great potential for making this interface a bidirectional exchange point, the theme of this review paper. Not surprisingly, regression, the workhorse of statistics, provides one tool for such historically based recompense. Another prominent one is the mathematical matrix theory eigenfunction abstraction. A third is special matrix operations, such as Kronecker sums and products. A fourth is multivariable calculus linkages, especially arcane matrix/vector operators as well as the Jacobian term associated with variable transformations. A fifth, and the final idea this paper treats, is random matrices/vectors within the context of simulation, particularly for correlated data. 
These are the five prospectively reviewed discipline of statistics subjects capable of informing, inspiring, or otherwise furnishing insight to the far more general world of linear algebra.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"147 9","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41311753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2022-11-05, DOI: 10.3390/stats5040064
Tzong-Ru Tsai, Hua Xin, Yanqin Fan, Y. Lio
{"title":"Bias-Corrected Maximum Likelihood Estimation and Bayesian Inference for the Process Performance Index Using Inverse Gaussian Distribution","authors":"Tzong-Ru Tsai, Hua Xin, Yanqin Fan, Y. Lio","doi":"10.3390/stats5040064","DOIUrl":"https://doi.org/10.3390/stats5040064","url":null,"abstract":"In this study, the estimation methods of bias-corrected maximum likelihood (BCML), bootstrap BCML (B-BCML) and Bayesian using Jeffreys’ prior distribution were proposed for the inverse Gaussian distribution with small sample cases to obtain the ML and Bayes estimators of the model parameters and the process performance index based on the lower specification process performance index. Moreover, an approximate confidence interval and the highest posterior density interval of the process performance index were established via the delta and Bayesian inference methods, respectively. To overcome the computational difficulty of sampling from the posterior distribution in Bayesian inference, the Markov chain Monte Carlo approach was used to implement the proposed Bayesian inference procedures. Monte Carlo simulations were conducted to evaluate the performance of the proposed BCML, B-BCML and Bayesian estimation methods. An example of the active repair times for an airborne communication transceiver is used for illustration.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43296690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2022-11-01, DOI: 10.3390/stats5040063
Paolo Onorati, B. Liseo
{"title":"Bayesian Hierarchical Copula Models with a Dirichlet–Laplace Prior","authors":"Paolo Onorati, B. Liseo","doi":"10.3390/stats5040063","DOIUrl":"https://doi.org/10.3390/stats5040063","url":null,"abstract":"We discuss a Bayesian hierarchical copula model for clusters of financial time series. A similar approach has been developed in a recent paper. However, the prior distributions proposed there do not always provide a proper posterior. In order to circumvent the problem, we adopt a proper global–local shrinkage prior, which is also able to account for potential dependence structures among different clusters. The performance of the proposed model is presented via simulations and a real data analysis.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47398319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2022-10-31, DOI: 10.3390/stats5040062
Vijay Kumar
{"title":"Product Recalls in European Textile and Clothing Sector—A Macro Analysis of Risks and Geographical Patterns","authors":"Vijay Kumar","doi":"10.3390/stats5040062","DOIUrl":"https://doi.org/10.3390/stats5040062","url":null,"abstract":"Textile and clothing (T&C) products contribute to a substantial proportion of the nonfood product recalls in the European Union (EU) due to various levels of associated risks. Out of the listed 34 categories for product recalls in the EU’s Rapid Exchange of Information System (RAPEX), the category ‘clothing, textiles, and fashion items’ was among the top 3 categories with the most recall cases during 2013–2019. Previous studies have attempted to highlight the issue of product recalls and their impacts from the perspective of a single company or selected companies, whereas limited attention has been paid to understanding the problem from a sector-specific perspective. However, considering the nature of product risks and the consistently high number of recall cases, it is important to analyze the issue of product recalls in the T&C sector from a sector-specific perspective. In this context, the paper focuses on investigating the past recalls in the T&C sector reported in RAPEX during 2005–2021 to understand the major trends in recall occurrence and associated hazards. Correspondence Analysis (CA) and Latent Dirichlet Allocation (LDA) were applied to analyze the qualitative and quantitative recall data. The results reveal that there is a geographical pattern for the product risk that leads to the recalls. The countries in the eastern part of Europe tend to have proportionately high recalls in strangulation and choking-related issues, whereas chemical-related recalls are proportionately high in countries located in the western part of Europe. 
Further, text-mining results indicate that design-related recall issues are more prevalent in children’s clothing.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49282568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}