Stats | Pub Date: 2023-12-05 | DOI: 10.3390/stats6040081
Aris Spanos
{"title":"Revisiting the Large n (Sample Size) Problem: How to Avert Spurious Significance Results","authors":"Aris Spanos","doi":"10.3390/stats6040081","DOIUrl":"https://doi.org/10.3390/stats6040081","url":null,"abstract":"Although large data sets are generally viewed as advantageous for their ability to provide more precise and reliable evidence, it is often overlooked that these benefits are contingent upon certain conditions being met. The primary condition is the approximate validity (statistical adequacy) of the probabilistic assumptions comprising the statistical model Mθ(x) applied to the data. In the case of a statistically adequate Mθ(x) and a given significance level α, as n increases, the power of a test increases, and the p-value decreases due to the inherent trade-off between type I and type II error probabilities in frequentist testing. This trade-off raises concerns about the reliability of declaring ‘statistical significance’ based on conventional significance levels when n is exceptionally large. To address this issue, the author proposes that a principled approach, in the form of post-data severity (SEV) evaluation, be employed. The SEV evaluation represents a post-data error probability that converts unduly data-specific ‘accept/reject H0 results’ into evidence either supporting or contradicting inferential claims regarding the parameters of interest. This approach offers a more nuanced and robust perspective in navigating the challenges posed by the large n problem.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"68 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138598495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
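The large n effect described in the abstract above can be illustrated numerically. The following is a minimal sketch, assuming a one-sided z-test for a Normal mean with known standard deviation and a hypothetical, fixed observed discrepancy of 0.02; these choices are illustrative assumptions and are not taken from the paper. It does not implement the SEV evaluation; it only shows the p-value shrinking as n grows while the observed discrepancy stays the same.

```python
import math


def one_sided_p_value(xbar, mu0, sigma, n):
    """Upper-tail p-value of a z-test of H0: mu <= mu0 (sigma assumed known)."""
    z = math.sqrt(n) * (xbar - mu0) / sigma
    # P(Z > z) for a standard Normal Z, written with the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p_values_by_n(xbar, mu0, sigma, sample_sizes):
    """Return (n, p-value) pairs for the same observed mean at several sample sizes."""
    return [(n, one_sided_p_value(xbar, mu0, sigma, n)) for n in sample_sizes]


if __name__ == "__main__":
    # Hypothetical values: observed mean 0.02, null value 0, sigma 1.
    for n, p in p_values_by_n(0.02, 0.0, 1.0, [100, 1000, 10000, 100000]):
        print(f"n={n:>6d}  p-value={p:.3g}")
```

With these assumed values the same small discrepancy is far from significant at n = 100 and overwhelmingly significant at n = 100000, which is the concern the abstract raises about conventional significance levels.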
{"title":"Process Monitoring Using Truncated Gamma Distribution","authors":"Sajid Ali, Shayaan Rajput, Ismail Shah, Hassan Houmani","doi":"10.3390/stats6040080","DOIUrl":"https://doi.org/10.3390/stats6040080","url":null,"abstract":"The time-between-events idea is commonly used for monitoring high-quality processes. This study aims to monitor the increase and/or decrease in the process mean rapidly using a one-sided exponentially weighted moving average (EWMA) chart for the detection of upward or downward mean shifts using a truncated gamma distribution. The use of the truncation method helps to enhance and improve the sensitivity of the proposed chart. The performance of the proposed chart with known and estimated parameters is analyzed by using the run length properties, including the average run length (ARL) and standard deviation run length (SDRL), through extensive Monte Carlo simulation. The numerical results show that the proposed scheme is more sensitive than the existing ones. Finally, the chart is implemented in real-world situations to highlight the significance of the proposed chart.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138613377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats | Pub Date: 2023-11-29 | DOI: 10.3390/stats6040079
A. Adebanji, Franz Aschl, Ednah Chepkemoi Chumo, Emmanuel Odame Owiredu, Johannes Müller, Tukae Mbegalo
{"title":"Social Response and Measles Dynamics","authors":"A. Adebanji, Franz Aschl, Ednah Chepkemoi Chumo, Emmanuel Odame Owiredu, Johannes Müller, Tukae Mbegalo","doi":"10.3390/stats6040079","DOIUrl":"https://doi.org/10.3390/stats6040079","url":null,"abstract":"Measles remains one of the leading causes of death among young children globally, even though a safe and cost-effective vaccine is available. Vaccine hesitancy and social response to vaccination continue to undermine efforts to eradicate measles. In this study, we consider data about measles vaccination and measles prevalence in Germany for the years 2008–2012 in 345 districts. In the first part of the paper, we show that the probability of a local outbreak does not significantly depend on the vaccination coverage, but—if an outbreak does take place—the scale of the outbreak depends significantly on the vaccination coverage. Additionally, we show that the willingness to be vaccinated is significantly increased by local outbreaks, with a delay of about one year. In the second part of the paper, we consider a deterministic delay model to investigate the consequences of the statistical findings on the dynamics of the infection. Here, we find that the delay might induce oscillations if the vaccination coverage is rather low and the social response to an outbreak is sufficiently strong. The relevance of our findings is discussed at the end of the paper.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139212369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats | Pub Date: 2023-11-21 | DOI: 10.3390/stats6040078
R. Guerra, Fernando A. Peña-Ramírez, G. Cordeiro
{"title":"The Logistic Burr XII Distribution: Properties and Applications to Income Data","authors":"R. Guerra, Fernando A. Peña-Ramírez, G. Cordeiro","doi":"10.3390/stats6040078","DOIUrl":"https://doi.org/10.3390/stats6040078","url":null,"abstract":"We define and study the four-parameter logistic Burr XII distribution. It is obtained by inserting the three-parameter Burr XII distribution as the baseline in the logistic-X family and may be a useful alternative method to model income distribution and could be applied to other areas. We illustrate that the new distribution can have decreasing and upside-down-bathtub hazard functions and that its density function is an infinite linear combination of Burr XII densities. Some mathematical properties of the proposed model are determined, such as the quantile function, ordinary and incomplete moments, and generating function. We also obtain the maximum likelihood estimators of the model parameters and perform a Monte Carlo simulation study. Further, we present a parametric regression model based on the introduced distribution as an alternative to the location-scale regression model. The potentiality of the new distribution is illustrated by means of two applications to income data sets.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"92 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139252783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats | Pub Date: 2023-11-11 | DOI: 10.3390/stats6040077
Javier Linkolk López-Gonzales, Ana María Gómez Lamus, Romina Torres, Paulo Canas Rodrigues, Rodrigo Salas
{"title":"Self-Organizing Topological Multilayer Perceptron: A Hybrid Method to Improve the Forecasting of Extreme Pollution Values","authors":"Javier Linkolk López-Gonzales, Ana María Gómez Lamus, Romina Torres, Paulo Canas Rodrigues, Rodrigo Salas","doi":"10.3390/stats6040077","DOIUrl":"https://doi.org/10.3390/stats6040077","url":null,"abstract":"Forecasting air pollutant levels is essential in regulatory plans focused on controlling and mitigating air pollutants, such as particulate matter. Focusing the forecast on air pollution peaks is challenging and complex since the pollutant time series behavior is not regular and is affected by several environmental and urban factors. In this study, we propose a new hybrid method based on artificial neural networks to forecast daily extreme events of PM2.5 pollution concentration. The hybrid method combines self-organizing maps, to identify temporal patterns of excessive daily pollution found at different monitoring stations, with a set of multilayer perceptrons to forecast extreme values of PM2.5 for each cluster. The proposed model was applied to analyze five-year pollution data obtained from nine weather stations in the metropolitan area of Santiago, Chile. Simulation results show that the hybrid method improves performance metrics when forecasting daily extreme values of PM2.5.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"39 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135086869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats | Pub Date: 2023-11-04 | DOI: 10.3390/stats6040076
Célestin C. Kokonendji, Sobom M. Somé, Youssef Esstafa, Marcelo Bourguignon
{"title":"On Underdispersed Count Kernels for Smoothing Probability Mass Functions","authors":"Célestin C. Kokonendji, Sobom M. Somé, Youssef Esstafa, Marcelo Bourguignon","doi":"10.3390/stats6040076","DOIUrl":"https://doi.org/10.3390/stats6040076","url":null,"abstract":"Only a few count smoothers are available for the widespread use of discrete associated kernel estimators, and their constructions lack systematic approaches. This paper proposes the mean dispersion technique for building count kernels. It is only applicable to count distributions that exhibit the underdispersion property, which ensures the convergence of the corresponding estimators. In addition to the well-known binomial and recent CoM-Poisson kernels, we introduce two new ones, namely the double Poisson and gamma-count kernels. Despite the challenging problem of obtaining explicit expressions, these kernels effectively smooth densities. Their good performance is shown by both numerical and comparative analyses, particularly for small and moderate sample sizes. The optimal tuning parameter is investigated here via integrated squared errors. Faster computation times are an added advantage. Thus, the overall accuracy of the two newly suggested kernels appears to lie between that of the two older ones. Finally, an application including a tail probability estimation on a real count data set and some concluding remarks are given.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"39 26","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135773638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats | Pub Date: 2023-11-02 | DOI: 10.3390/stats6040075
Vladimir Kovtun, Avi Giloni, Clifford Hurvich, Sridhar Seshadri
{"title":"Pivot Clustering to Minimize Error in Forecasting Aggregated Demand Streams Each Following an Autoregressive Moving Average Model","authors":"Vladimir Kovtun, Avi Giloni, Clifford Hurvich, Sridhar Seshadri","doi":"10.3390/stats6040075","DOIUrl":"https://doi.org/10.3390/stats6040075","url":null,"abstract":"In this paper, we compare the effects of forecasting demand using individual (disaggregated) components versus first aggregating the components either fully or into several clusters. Demand streams are assumed to follow autoregressive moving average (ARMA) processes. Using individual demand streams will always lead to a superior forecast compared to any aggregates; however, we show that if several aggregated clusters are formed in a structured manner, then these subaggregated clusters will lead to a forecast with minimal increase in mean-squared forecast error. We show this result based on theoretical MSFE obtained directly from the models generating the clusters as well as estimated MSFE obtained directly from simulated demand observations. We suggest a pivot algorithm, which we call Pivot Clustering, to create these clusters. We also provide theoretical results to investigate sub-aggregation, including for special cases, such as aggregating demand generated by MA(1) models and aggregating demand generated by ARMA models with similar or the same parameters.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"11 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135973024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats | Pub Date: 2023-11-01 | DOI: 10.3390/stats6040074
Gebrenegus Ghilagaber, Rolf Larsson
{"title":"Adjustment of Anticipatory Covariates in Retrospective Surveys: An Expected Likelihood Approach","authors":"Gebrenegus Ghilagaber, Rolf Larsson","doi":"10.3390/stats6040074","DOIUrl":"https://doi.org/10.3390/stats6040074","url":null,"abstract":"We address an inference issue where the value of a covariate is measured at the date of the survey but is used to explain behavior that has occurred long before the survey. This causes bias because the value of the covariate does not follow the temporal order of events. We propose an expected likelihood approach to adjust for such bias and illustrate it with data on the effects of educational level achieved by the time of marriage on risks of divorce. For individuals with anticipatory educational level (whose reported educational level was completed after marriage), conditional probabilities of having attained the reported level before marriage are computed. These are then used as weights in the expected likelihood to obtain adjusted estimates of relative risks. For our illustrative data set, the adjusted estimates of relative risks of divorce did not differ significantly from those obtained from anticipatory analysis that ignores the temporal order of events. Our results are slightly different from those in two other studies that analyzed the same data set in a Bayesian framework, though the studies are not fully comparable to each other.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"2 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135271487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats | Pub Date: 2023-10-25 | DOI: 10.3390/stats6040073
Alexander Robitzsch
{"title":"Implementation Aspects in Invariance Alignment","authors":"Alexander Robitzsch","doi":"10.3390/stats6040073","DOIUrl":"https://doi.org/10.3390/stats6040073","url":null,"abstract":"In social sciences, multiple groups, such as countries, are frequently compared regarding a construct that is assessed using a number of items administered in a questionnaire. The corresponding scale is assessed with a unidimensional factor model involving a latent factor variable. To enable a comparison of the mean and standard deviation of the factor variable across groups, identification constraints on item intercepts and factor loadings must be imposed. Invariance alignment (IA) provides such a group comparison in the presence of partial invariance (i.e., a minority of item intercepts and factor loadings are allowed to differ across groups). IA is a linking procedure that separately fits a factor model in each group in the first step. In the second step, a linking of estimated item intercepts and factor loadings is conducted using a robust loss function L0.5. The present article discusses implementation alternatives in IA. It compares the default L0.5 loss function with Lp with other values of the power p between 0 and 1. Moreover, the nondifferentiable Lp loss functions are replaced with differentiable approximations in the estimation of IA that depend on a tuning parameter ε (such as, e.g., ε=0.01). The consequences of choosing different values of ε are discussed. Moreover, this article proposes the L0 loss function with a differentiable approximation for IA. Finally, it is demonstrated that the default linking function in IA introduces bias in estimated means and standard deviations if there is noninvariance in factor loadings. Therefore, an alternative linking function based on logarithmized factor loadings is examined for estimating factor means and standard deviations. The implementation alternatives are compared through three simulation studies. It turned out that the linking function for factor loadings in IA should be replaced by the alternative involving logarithmized factor loadings. Furthermore, the default L0.5 loss function is inferior to the newly proposed L0 loss function regarding the bias and root mean square error of factor means and standard deviations.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"64 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135168392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
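The idea of replacing a nondifferentiable Lp loss with a smooth approximation controlled by ε, as mentioned in the abstract above, can be sketched as follows. The abstract does not state the exact functional form, so this sketch assumes the common smoothing (x² + ε)^(p/2); the function names are hypothetical, and neither the L0 variant nor the IA linking step is implemented here.

```python
def lp_loss(x, p):
    """Exact L_p loss |x|**p, which is nondifferentiable at x = 0 for 0 < p <= 1."""
    return abs(x) ** p


def smooth_lp_loss(x, p, eps=0.01):
    """Assumed differentiable approximation (x**2 + eps)**(p/2), with eps > 0."""
    return (x * x + eps) ** (p / 2.0)


def smooth_lp_grad(x, p, eps=0.01):
    """Derivative of the assumed approximation with respect to x."""
    return p * x * (x * x + eps) ** (p / 2.0 - 1.0)


if __name__ == "__main__":
    # Compare the exact and smoothed losses for p = 0.5 at a few residuals.
    for x in (0.0, 0.05, 0.5, 2.0):
        print(x, lp_loss(x, 0.5), smooth_lp_loss(x, 0.5), smooth_lp_grad(x, 0.5))
```

Under this assumed form, smaller ε tracks |x|^p more closely but makes the gradient change more sharply near zero, which is one reason the choice of ε matters for numerical optimization.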
Stats | Pub Date: 2023-10-25 | DOI: 10.3390/stats6040072
Szilárd Nemes
{"title":"Asymptotic Relative Efficiency of Parametric and Nonparametric Survival Estimators","authors":"Szilárd Nemes","doi":"10.3390/stats6040072","DOIUrl":"https://doi.org/10.3390/stats6040072","url":null,"abstract":"The dominance of non- and semi-parametric methods in survival analysis is not without criticism. Several studies have highlighted the decrease in efficiency compared to parametric methods. We revisit the problem of Asymptotic Relative Efficiency (ARE) of the Kaplan–Meier survival estimator compared to parametric survival estimators. We begin by generalizing Miller’s approach and presenting a formula that enables the estimation (numerical or exact) of ARE for various survival distributions and types of censoring. We examine the effect of follow-up time and censoring on ARE. The article concludes with a discussion about the reasons behind the lower and time-dependent ARE of the Kaplan–Meier survival estimator.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"32 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135216538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}