Stats Pub Date: 2023-08-29 DOI: 10.3390/stats6030056
Fanny Rancourt, Paula Vondrlik, Diego Maupomé, Marie-Jean Meurs
{"title":"Investigating Self-Rationalizing Models for Commonsense Reasoning","authors":"Fanny Rancourt, Paula Vondrlik, Diego Maupomé, Marie-Jean Meurs","doi":"10.3390/stats6030056","DOIUrl":"https://doi.org/10.3390/stats6030056","url":null,"abstract":"The rise of explainable natural language processing spurred a bulk of work on datasets augmented with human explanations, as well as technical approaches to leverage them. Notably, generative large language models offer new possibilities, as they can output a prediction as well as an explanation in natural language. This work investigates the capabilities of fine-tuned text-to-text transfer Transformer (T5) models for commonsense reasoning and explanation generation. Our experiments suggest that while self-rationalizing models achieve interesting results, a significant gap remains: classifiers consistently outperformed self-rationalizing models, and a substantial fraction of model-generated explanations are not valid. Furthermore, training with expressive free-text explanations substantially altered the inner representation of the model, suggesting that they supplied additional information and may bridge the knowledge gap. Our code is publicly available, and the experiments were run on open-access datasets, hence allowing full reproducibility.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47029258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2023-08-25 DOI: 10.3390/stats6030055
S. Lipovetsky
{"title":"Statistical Modeling of Implicit Functional Relations","authors":"S. Lipovetsky","doi":"10.3390/stats6030055","DOIUrl":"https://doi.org/10.3390/stats6030055","url":null,"abstract":"This study considers the statistical estimation of relations presented by implicit functions. Such structures define mutual interconnections of variables rather than outcome variable dependence by predictor variables considered in regular regression analysis. For a simple case of two variables, pairwise regression modeling produces two different lines of each variable dependence using another variable, but building an implicit relation yields one invertible model composed of two simple regressions. Modeling an implicit linear relation for multiple variables can be expressed as a generalized eigenproblem of the covariance matrix of the variables in the metric of the covariance matrix of their errors. For unknown errors, this work describes their estimation by the residual errors of each variable in its regression by the other predictors. Then, the generalized eigenproblem can be reduced to the diagonalization of a special matrix built from the variables’ covariance matrix and its inversion. Numerical examples demonstrate the eigenvector solution’s good properties for building a unique equation of the relations between all variables. The proposed approach can be useful in practical regression modeling with all variables containing unobserved errors, which is a common situation for the applied problems.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42159354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2023-08-15 DOI: 10.3390/stats6030054
Helder Jose Celani de Souza, V. Salomon, Carlos Eduardo Sanches da Silva
{"title":"Statistical Predictors of Project Management Maturity","authors":"Helder Jose Celani de Souza, V. Salomon, Carlos Eduardo Sanches da Silva","doi":"10.3390/stats6030054","DOIUrl":"https://doi.org/10.3390/stats6030054","url":null,"abstract":"Global scenarios of organizations show investments wasted in projects with poor performances in more than 11 percent of cases, according to the Project Management Institute. This research aims to guide organizations in assertively investing in the right pertinent factors to improve project success rates and speed up project management maturity at a higher accuracy level using statistical predictions. Challenging existing drivers for project management maturity models and expanding their current practical view will be the result of a quantitative methodology based on a survey supported by data collection targeting the project management community in Brazil. The originality and value of this research are in contributing to the development of new project maturity models statistically supported by the increasing rate of maturity accuracy, which can be continually improved by confident data input into the model. The results show a high correlation between the performance measurement system and the project success rate associated with project management maturity. In addition, this research contemplates the relationship between organizational culture, business type, and project management office and project management maturity.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48820979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2023-08-11 DOI: 10.3390/stats6030053
D. Politis, Kejin Wu
{"title":"Multi-Step-Ahead Prediction Intervals for Nonparametric Autoregressions via Bootstrap: Consistency, Debiasing, and Pertinence","authors":"D. Politis, Kejin Wu","doi":"10.3390/stats6030053","DOIUrl":"https://doi.org/10.3390/stats6030053","url":null,"abstract":"To address the difficult problem of the multi-step-ahead prediction of nonparametric autoregressions, we consider a forward bootstrap approach. Employing a local constant estimator, we can analyze a general type of nonparametric time-series model and show that the proposed point predictions are consistent with the true optimal predictor. We construct a quantile prediction interval that is asymptotically valid. Moreover, using a debiasing technique, we can asymptotically approximate the distribution of multi-step-ahead nonparametric estimation by the bootstrap. As a result, we can build bootstrap prediction intervals that are pertinent, i.e., can capture the model estimation variability, thus improving the standard quantile prediction intervals. Simulation studies are presented to illustrate the performance of our point predictions and pertinent prediction intervals for finite samples.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46813674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2023-08-04 DOI: 10.3390/stats6030051
P. N. Rathie, L. Ozelim, Felipe Quintino, Tiago A. da Fonseca
{"title":"On the Extreme Value H-Function","authors":"P. N. Rathie, L. Ozelim, Felipe Quintino, Tiago A. da Fonseca","doi":"10.3390/stats6030051","DOIUrl":"https://doi.org/10.3390/stats6030051","url":null,"abstract":"In the present paper, a new special function, the so-called extreme value H-function, is introduced. This new function, which is a generalization of the H-function with a particular set of parameters, appears while dealing with products and quotients of a wide class of extreme value random variables. Some properties, special cases and a series representation are provided. Some statistical applications are also briefly discussed.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47291323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2023-07-31 DOI: 10.3390/stats6030050
Gayan Warahena-Liyanage, B. Oluyede, Thatayaone Moakofi, Whatmore Sengweni
{"title":"The New Exponentiated Half Logistic-Harris-G Family of Distributions with Actuarial Measures and Applications","authors":"Gayan Warahena-Liyanage, B. Oluyede, Thatayaone Moakofi, Whatmore Sengweni","doi":"10.3390/stats6030050","DOIUrl":"https://doi.org/10.3390/stats6030050","url":null,"abstract":"In this study, we introduce a new generalized family of distributions called the Exponentiated Half Logistic-Harris-G (EHL-Harris-G) distribution, which extends the Harris-G distribution. The motivation for introducing this generalized family of distributions lies in its ability to overcome the limitations of previous families, enhance flexibility, improve tail behavior, provide better statistical properties and find applications in several fields. Several statistical properties, including hazard rate function, quantile function, moments, moments of residual life, distribution of the order statistics and Rényi entropy are discussed. Risk measures, such as value at risk, tail value at risk, tail variance and tail variance premium, are also derived and studied. To estimate the parameters of the EHL-Harris-G family of distributions, the following six different estimation approaches are used: maximum likelihood (MLE), least-squares (LS), weighted least-squares (WLS), maximum product spacing (MPS), Cramér–von Mises (CVM), and Anderson–Darling (AD). The Monte Carlo simulation results for EHL-Harris-Weibull (EHL-Harris-W) show that the MLE method allows us to obtain better estimates, followed by WLS and then AD. Finally, we show that the EHL-Harris-W distribution is superior to some other equi-parameter non-nested models in the literature, by fitting it to two real-life data sets from different disciplines.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41721887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2023-07-27 DOI: 10.3390/stats6030049
Zhiyi Zhang, Hongwei Huang, Hao Xu
{"title":"Khinchin’s Fourth Axiom of Entropy Revisited","authors":"Zhiyi Zhang, Hongwei Huang, Hao Xu","doi":"10.3390/stats6030049","DOIUrl":"https://doi.org/10.3390/stats6030049","url":null,"abstract":"The Boltzmann–Gibbs–Shannon (BGS) entropy is the only entropy form satisfying four conditions known as Khinchin’s axioms. The uniqueness theorem of the BGS entropy, plus the fact that Shannon’s mutual information completely characterizes independence between the two underlying random elements, puts the BGS entropy in a special place in many fields of study. In this article, the fourth axiom is replaced by a slightly weakened condition: an entropy whose associated mutual information is zero if and only if the two underlying random elements are independent. Under the weaker fourth axiom, other forms of entropy are sought by way of escort transformations. Two main results are reported in this article. First, there are many entropies other than the BGS entropy satisfying the weaker condition, yet retaining all the desirable utilities of the BGS entropy. Second, by way of escort transformations, the newly identified entropies are the only ones satisfying the weaker axioms.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42382743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2023-07-05 DOI: 10.3390/stats6030048
V. Distefano, Maria Mannone, I. Poli
{"title":"Exploring Heterogeneity with Category and Cluster Analyses for Mixed Data","authors":"V. Distefano, Maria Mannone, I. Poli","doi":"10.3390/stats6030048","DOIUrl":"https://doi.org/10.3390/stats6030048","url":null,"abstract":"Precision medicine aims to overcome the traditional one-model-fits-the-whole-population approach that is unable to detect heterogeneous disease patterns and make accurate personalized predictions. Heterogeneity is particularly relevant for patients with complications of type 2 diabetes, including diabetic kidney disease (DKD). We focus on a DKD longitudinal dataset, aiming to find specific subgroups of patients with characteristics that have a close response to the therapeutic treatment. We develop an approach based on some particular concepts of category theory and cluster analysis to explore individualized modelings and achieving insights onto disease evolution. This paper exploits the visualization tools provided by category theory, and bridges category-based abstract works and real datasets. We build subgroups deriving clusters of patients at different time points, considering a set of variables characterizing the state of patients. We analyze how specific variables affect the disease progress, and which drug combinations are more effective for each cluster of patients. The retrieved information can foster individualized strategies for DKD treatment.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46930630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2023-06-29 DOI: 10.3390/stats6030047
L. Klebanov
{"title":"Some More Results on Characterization of the Exponential and Related Distributions","authors":"L. Klebanov","doi":"10.3390/stats6030047","DOIUrl":"https://doi.org/10.3390/stats6030047","url":null,"abstract":"There are given characterizations of the exponential distribution based on the properties of independence of linear forms with random coefficients. Results based on the constancy of regression of one statistic in a linear form are obtained. Related characterizations based on the property of the identical distribution of statistics are also provided.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48494078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats Pub Date: 2023-06-26 DOI: 10.3390/stats6030046
S. Caudill, F. Mixon
{"title":"Guess for Success? Application of a Mixture Model to Test-Wiseness on Multiple-Choice Exams","authors":"S. Caudill, F. Mixon","doi":"10.3390/stats6030046","DOIUrl":"https://doi.org/10.3390/stats6030046","url":null,"abstract":"The use of large lecture halls in business and economic education often dictates the use of multiple-choice exams to measure student learning. This study asserts that student performance on these types of exams can be viewed as the result of the process of elimination of incorrect answers, rather than the selection of the correct answer. More specifically, how students respond on a multiple-choice test can be broken down into the fractions of questions where no wrong answers can be eliminated (i.e., random guessing), one wrong answer can be eliminated, two wrong answers can be eliminated, and all wrong answers can be eliminated. The results from an empirical model, representing a mixture of binomials in which the probability of a correct choice depends on the number of incorrect choices eliminated, we find, using student performance data from a final exam in principles of microeconomics consisting of 100 multiple choice questions, that the responses to all of the questions on the exam can be characterized by some form of guessing, with more than 26 percent of questions being completed using purely random guessing.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46034470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}