{"title":"Average ordinary least squares‐centered penalized regression: A more efficient way to address multicollinearity than ridge regression","authors":"Wei Wang, Linjiang Li, Sheng Li, F. Yin, Fang Liao, Zhang Tao, Xiaosong Li, Xiong Xiao, Yue Ma","doi":"10.1111/stan.12263","DOIUrl":"https://doi.org/10.1111/stan.12263","url":null,"abstract":"We developed a novel method to address multicollinearity in linear models called average ordinary least squares (OLS)‐centered penalized regression (AOPR). AOPR penalizes the cost function to shrink the estimators toward the weighted‐average OLS estimator. The commonly used ridge regression (RR) shrinks the estimators toward zero, that is, employs penalization prior β∼N(0,1/k) in the Bayesian view, which contradicts the common real prior β≠0. Therefore, RR selects small penalization coefficients to relieve such a contradiction and thus makes the penalizations inadequate. Mathematical derivations indicate that AOPR could increase the performance of RR and OLS regression. A simulation study shows that AOPR obtains more accurate estimators than OLS regression in most situations and more accurate estimators than RR when the signs of the true βs are identical and is slightly less accurate than RR when the signs of the true βs are different. Additionally, a case study shows that AOPR obtains more stable estimators and stronger statistical power and predictive ability than RR and OLS regression. Through these results, we recommend using AOPR to address multicollinearity more efficiently than RR and OLS regression, especially when the true βs have identical signs.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":"88 1","pages":"347 - 368"},"PeriodicalIF":1.5,"publicationDate":"2022-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81131506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction intervals for all of M future observations based on linear random effects models","authors":"M. Menssen, F. Schaarschmidt","doi":"10.1111/stan.12260","DOIUrl":"https://doi.org/10.1111/stan.12260","url":null,"abstract":"In many pharmaceutical and biomedical applications such as assay validation, assessment of historical control data, or the detection of anti‐drug antibodies, the calculation and interpretation of prediction intervals (PI) is of interest. The present study provides two novel methods for the calculation of prediction intervals based on linear random effects models and restricted maximum likelihood (REML) estimation. Unlike other REML‐based PI found in the literature, both intervals reflect the uncertainty related to the estimation of the prediction variance. The first PI is based on Satterthwaite approximation. For the other PI, a bootstrap calibration approach that we will call quantile‐calibration was used. Due to the calibration process this PI can be easily computed for more than one future observation and based on balanced and unbalanced data as well. In order to compare the coverage probabilities of the proposed PI with those of four intervals found in the literature, Monte Carlo simulations were run for two relatively complex random effects models and a broad range of parameter settings. The quantile‐calibrated PI was implemented in the statistical software R and is available in the predint package.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":"66 1","pages":"283 - 308"},"PeriodicalIF":1.5,"publicationDate":"2021-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77968796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian model selection for multilevel mediation models","authors":"O. Ariyo, E. Lesaffre, G. Verbeke, M. Huisman, Judith Rijnhart, Martijn Heymans, J. Twisk","doi":"10.1111/stan.12256","DOIUrl":"https://doi.org/10.1111/stan.12256","url":null,"abstract":"Mediation analysis is often used to explore the complex relationship between two variables through a third mediating variable. This paper aims to illustrate the performance of the deviance information criterion, the pseudo‐Bayes factor, and the Watanabe–Akaike information criterion in selecting the appropriate multilevel mediation model. Our focus will be on comparing the conditional criteria (given random effects) versus the marginal criteria (averaged over random effects) in this respect. Most of the previous work on the multilevel mediation models fails to report the poor behavior of the conditional criteria. We demonstrate here the superiority of the marginal version of the selection criteria over their conditional counterpart in the mediated longitudinal settings through simulation studies and via an application to data from the Longitudinal Aging Study of the Amsterdam study. In addition, we demonstrate the usefulness of our self‐written R function for multilevel mediation models.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":"49 1","pages":"219 - 235"},"PeriodicalIF":1.5,"publicationDate":"2021-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75993438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Competing risks regression for clustered survival data via the marginal additive subdistribution hazards model","authors":"Xinyuan Chen, D. Esserman, Fan Li","doi":"10.1111/stan.12317","DOIUrl":"https://doi.org/10.1111/stan.12317","url":null,"abstract":"A population‐averaged additive subdistribution hazards model is proposed to assess the marginal effects of covariates on the cumulative incidence function and to analyze correlated failure time data subject to competing risks. This approach extends the population‐averaged additive hazards model by accommodating potentially dependent censoring due to competing events other than the event of interest. Assuming an independent working correlation structure, an estimating equations approach is outlined to estimate the regression coefficients and a new sandwich variance estimator is proposed. The proposed sandwich variance estimator accounts for both the correlations between failure times and between the censoring times, and is robust to misspecification of the unknown dependency structure within each cluster. We further develop goodness‐of‐fit tests to assess the adequacy of the additive structure of the subdistribution hazards for the overall model and each covariate. Simulation studies are conducted to investigate the performance of the proposed methods in finite samples. We illustrate our methods using data from the STrategies to Reduce Injuries and Develop confidence in Elders (STRIDE) trial.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":"58 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2021-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90297672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint probabilities under expected value constraints, transportation problems, maximum entropy in the mean","authors":"H. Gzyl, Silvia Mayoral","doi":"10.1111/stan.12314","DOIUrl":"https://doi.org/10.1111/stan.12314","url":null,"abstract":"There are interesting extensions of the problem of determining a joint probability with known marginals. On the one hand, one may impose size constraints on the joint probabilities. On the other, one may impose additional constraints like the expected values of known random variables. If we think of the marginal probabilities as demands or supplies, and of the joint probability as the fraction of the supplies to be shipped from the production sites to the demand sites, instead of joint probabilities we can think of transportation policies. Clearly, fixing the cost of a transportation policy is equivalent to an integral constraint upon the joint probability. We will show how to solve the cost constrained transportation problem by means of the method of maximum entropy in the mean. We shall also show how this approach leads to an interior point like method to solve the associated linear programming problem. We shall also investigate some geometric structure of the space of transportation policies, or joint probabilities or pixel space, using a Riemannian structure associated with the dual of the entropy used to determine bounds between probabilities or between transportation policies.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":"20 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2021-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81947429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Logistic or not Logistic?","authors":"J. Allison, B. Ebner, M. Smuts","doi":"10.1111/stan.12292","DOIUrl":"https://doi.org/10.1111/stan.12292","url":null,"abstract":"We propose a new class of goodness‐of‐fit tests for the logistic distribution based on a characterization related to the density approach in the context of Stein's method. This characterization‐based test is a first of its kind for the logistic distribution. The asymptotic null distribution of the test statistic is derived and it is shown that the test is consistent against fixed alternatives. The finite sample power performance of the newly proposed class of tests is compared to various existing tests by means of a Monte Carlo study. It is found that this new class of tests are especially powerful when the alternative distributions are heavy tailed, like Student's t and Cauchy, or for skew alternatives such as the log‐normal, gamma and chi‐square distributions.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":"74 1","pages":"429 - 443"},"PeriodicalIF":1.5,"publicationDate":"2021-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80832521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bootstrap for integer‐valued GARCH(p, q) processes","authors":"M. Neumann","doi":"10.1111/stan.12238","DOIUrl":"https://doi.org/10.1111/stan.12238","url":null,"abstract":"We consider integer‐valued processes with a linear or nonlinear generalized autoregressive conditional heteroscedastic models structure, where the count variables given the past follow a Poisson distribution. We show that a contraction condition imposed on the intensity function yields a contraction property of the Markov kernel of the process. This allows almost effortless proofs of the existence and uniqueness of a stationary distribution as well as of absolute regularity of the count process. As our main result, we construct a coupling of the original process and a model‐based bootstrap counterpart. Using a contraction property of the Markov kernel of the coupled process we obtain bootstrap consistency for different types of statistics.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":"7 1","pages":"343 - 363"},"PeriodicalIF":1.5,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84383409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Goodness‐of‐fit tests for Poisson count time series based on the Stein–Chen identity","authors":"Boris Aleksandrov, C. Weiß, C. Jentsch","doi":"10.1111/stan.12252","DOIUrl":"https://doi.org/10.1111/stan.12252","url":null,"abstract":"To test the null hypothesis of a Poisson marginal distribution, test statistics based on the Stein–Chen identity are proposed. For a wide class of Poisson count time series, the asymptotic distribution of different types of Stein–Chen statistics is derived, also if multiple statistics are jointly applied. The performance of the tests is analyzed with simulations, as well as the question which Stein–Chen functions should be used for which alternative. Illustrative data examples are presented, and possible extensions of the novel Stein–Chen approach are discussed as well.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":"107 1","pages":"35 - 64"},"PeriodicalIF":1.5,"publicationDate":"2021-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88392751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The basic distributional theory for the product of zero mean correlated normal random variables","authors":"Robert E. Gaunt","doi":"10.1111/stan.12267","DOIUrl":"https://doi.org/10.1111/stan.12267","url":null,"abstract":"The product of two zero mean correlated normal random variables, and more generally the sum of independent copies of such random variables, has received much attention in the statistics literature and appears in many application areas. However, many important distributional properties are yet to be recorded. This review paper fills this gap by providing the basic distributional theory for the sum of independent copies of the product of two zero mean correlated normal random variables. Properties covered include probability and cumulative distribution functions, generating functions, moments and cumulants, mode and median, Stein characterisations, representations in terms of other random variables, and a list of related distributions. We also review how the product of two zero mean correlated normal random variables arises naturally as a limiting distribution, with an example given for the distributional approximation of double Wiener‐Itô integrals.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":"32 1","pages":"450 - 470"},"PeriodicalIF":1.5,"publicationDate":"2021-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76103266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the conditional noncentral beta distribution","authors":"C. Orsi","doi":"10.1111/stan.12249","DOIUrl":"https://doi.org/10.1111/stan.12249","url":null,"abstract":"The beta family owes its privileged status within unit interval distributions to several relevant features such as, for example, ease of interpretation and versatility in modeling different types of data. However, the flexibility of its density at the endpoints of the support is poor enough to prevent proper modeling of the data portions having values next to zero and one. Such a drawback can be overcome by resorting to the class of the noncentral beta distributions. Indeed, the latter allows the density to take on arbitrary positive and finite limits which have a really simple form. Nevertheless, the analytical and mathematical complexity of this distribution poses strong limitations on its use as a model for data on the real interval (0, 1). That said, an in‐depth study of a newly found analogue of the noncentral beta distribution is carried out in this article. The latter preserves the applicative potential of the standard noncentral beta class but with the advantage of showing a more straightforward and easily handleable density.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":"68 1","pages":"164 - 189"},"PeriodicalIF":1.5,"publicationDate":"2021-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80063061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}