{"title":"A family of discrete maximum-entropy distributions","authors":"David J. Hessen","doi":"10.1016/j.jspi.2024.106243","DOIUrl":"10.1016/j.jspi.2024.106243","url":null,"abstract":"<div><div>In this paper, a family of maximum-entropy distributions with general discrete support is derived. Members of the family are distinguished by the number of specified non-central moments. In addition, a subfamily of discrete symmetric distributions is defined. Attention is paid to maximum likelihood estimation of the parameters of any member of the general family. It is shown that the parameters of any special case with infinite support can be estimated using a conditional distribution given a finite subset of the total support. In an empirical data example, the procedures proposed are demonstrated.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"236 ","pages":"Article 106243"},"PeriodicalIF":0.8,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142416588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Risk minimization using robust experimental or sampling designs and mixture of designs","authors":"Ejub Talovic, Yves Tillé","doi":"10.1016/j.jspi.2024.106241","DOIUrl":"10.1016/j.jspi.2024.106241","url":null,"abstract":"<div><div>For both experimental and sampling designs, the efficiency or balance of designs has been extensively studied. There are many ways to incorporate auxiliary information into designs. However, when we use balanced designs to decrease the variance due to an auxiliary variable, the variance may increase due to an effect which we define as lack of robustness. This robustness can be written as the largest eigenvalue of the variance operator of a sampling or experimental design. If this eigenvalue is large, then it might induce a large variance in the Horvitz–Thompson estimator of the total. We calculate or estimate the largest eigenvalue of the most common designs. We determine lower, upper bounds and approximations of this eigenvalue for different designs. Then, we compare these results with simulations that show the trade-off between efficiency and robustness. Those results can be used to determine the proper choice of designs for experiments such as clinical trials or surveys. We also propose a new and simple method for mixing two sampling designs, which allows to use a tuning parameter between two sampling designs. This method is then compared to the Gram–Schmidt walk design, which also governs the trade-off between robustness and efficiency. 
A set of simulation studies shows that our method of mixture gives similar results to the Gram–Schmidt walk design while having an interpretable variance matrix.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"236 ","pages":"Article 106241"},"PeriodicalIF":0.8,"publicationDate":"2024-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142416589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal s-level fractional factorial designs under baseline parameterization","authors":"Zhaohui Yan, Shengli Zhao","doi":"10.1016/j.jspi.2024.106242","DOIUrl":"10.1016/j.jspi.2024.106242","url":null,"abstract":"<div><div>In this paper, we explore the minimum aberration criterion for <span><math><mi>s</mi></math></span>-level designs under baseline parameterization, called BP-MA. We give a complete search method and an incomplete search method to obtain the BP-MA (or nearly BP-MA) designs. The methodology has no restriction on <span><math><mi>s</mi></math></span>, the levels of the factors. The catalogues of (nearly) BP-MA designs with <span><math><mrow><mi>s</mi><mo>=</mo><mn>2</mn><mo>,</mo><mn>3</mn><mo>,</mo><mn>4</mn><mo>,</mo><mn>5</mn></mrow></math></span> levels are provided.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"236 ","pages":"Article 106242"},"PeriodicalIF":0.8,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142357419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shifted BH methods for controlling false discovery rate in multiple testing of the means of correlated normals against two-sided alternatives","authors":"Sanat K. Sarkar, Shiyu Zhang","doi":"10.1016/j.jspi.2024.106238","DOIUrl":"10.1016/j.jspi.2024.106238","url":null,"abstract":"<div><div>For simultaneous testing of multivariate normal means with known correlation matrix against two-sided alternatives, this paper introduces new methods with proven finite-sample control of false discovery rate. The methods are obtained by shifting each <span><math><mi>p</mi></math></span>-value to the left and considering a Benjamini–Hochberg-type linear step-up procedure based on these shifted <span><math><mi>p</mi></math></span>-values. The amount of shift for each <span><math><mi>p</mi></math></span>-value is appropriately determined from the correlation matrix to achieve the desired false discovery rate control. Simulation studies and real-data application show favorable performances of the proposed methods when compared with relevant competitors.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"236 ","pages":"Article 106238"},"PeriodicalIF":0.8,"publicationDate":"2024-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142323239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On schematic orthogonal arrays of high strength","authors":"Rong Yan, Shanqi Pang, Jing Wang, Mengqian Chen","doi":"10.1016/j.jspi.2024.106230","DOIUrl":"10.1016/j.jspi.2024.106230","url":null,"abstract":"<div><p>Schematic orthogonal arrays are closely related to association schemes. And which orthogonal arrays are schematic orthogonal arrays and how to classify them is an open problem proposed by Hedayat et al. (1999). By using the Hamming distances, this paper presents some general methods for constructing schematic symmetric and mixed orthogonal arrays of high strength. As applications of these methods, we construct association schemes and many new schematic orthogonal arrays including several infinite classes of such arrays. Some examples are provided to illustrate the construction methods. The paper gives the partial solution of the problem by Hedayat et al. (1999) for symmetric and mixed orthogonal arrays of high strength.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"236 ","pages":"Article 106230"},"PeriodicalIF":0.8,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142162837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Zero-inflated multivariate tobit regression modeling","authors":"Becky Tang , Henry A. Frye , John A. Silander Jr. , Alan E. Gelfand","doi":"10.1016/j.jspi.2024.106229","DOIUrl":"10.1016/j.jspi.2024.106229","url":null,"abstract":"<div><p>A frequent challenge encountered in real-world applications is data having a high proportion of zeros. Focusing on ecological abundance data, much attention has been given to zero-inflated count data. Models for non-negative continuous abundance data with an excess of zeros are rarely discussed. Work presented here considers the creation of a point mass at zero through a left-censoring approach or through a hurdle approach. We incorporate both mechanisms to capture the analog of zero-inflation for count data. Additionally, primary attention has been given to univariate zero-inflated modeling (e.g., single species), whereas data often arise jointly (e.g., a collection of species). With multivariate abundance data, a key issue is to capture dependence among the species at a site, both in terms of positive abundance as well as absence. Therefore, our contribution is a model for multivariate zero-inflated continuous data that are non-negative. Working in a Bayesian framework, we discuss the issue of separating the two sources of zeros and offer model comparison metrics for multivariate zero-inflated data. 
In an application, we model the total biomass for five tree species obtained from plots established in the Forest Inventory Analysis database in the Northeast region of the United States.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"236 ","pages":"Article 106229"},"PeriodicalIF":0.8,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142150410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Convergent stochastic algorithm for estimation in general multivariate correlated frailty models using integrated partial likelihood","authors":"Ajmal Oodally , Luc Duchateau , Estelle Kuhn","doi":"10.1016/j.jspi.2024.106231","DOIUrl":"10.1016/j.jspi.2024.106231","url":null,"abstract":"<div><p>The Cox model with unspecified baseline hazard is often used to model survival data. In the case of correlated event times, this model can be extended by introducing random effects, also called frailty terms, leading to the frailty model. Few methods have been put forward to estimate parameters of such frailty models, and they often consider only a particular distribution for the frailty terms and specific correlation structures. In this paper, a new efficient method is introduced to perform parameter estimation by maximizing the integrated partial likelihood. The proposed stochastic estimation procedure can deal with frailty models with a broad choice of distributions for the frailty terms and with any kind of correlation structure between the frailty components, also allowing random interaction terms between the covariates and the frailty components. The almost sure convergence of the stochastic estimation algorithm towards a critical point of the integrated partial likelihood is proved. Numerical convergence properties are evaluated through simulation studies and comparison with existing methods is performed. In particular, the robustness of the proposed method with respect to different parametric baseline hazards and misspecified frailty distributions is demonstrated through simulation. 
Finally, the method is applied to a mastitis and a bladder cancer dataset.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"236 ","pages":"Article 106231"},"PeriodicalIF":0.8,"publicationDate":"2024-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142162836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effect of dimensionality on convergence rates of kernel ridge regression estimator","authors":"Kwan-Young Bak , Woojoo Lee","doi":"10.1016/j.jspi.2024.106228","DOIUrl":"10.1016/j.jspi.2024.106228","url":null,"abstract":"<div><div>Despite the curse of dimensionality, kernel ridge regression often exhibits good performance in practical applications, even when the dimension is moderately large. However, it has been shown that kernel ridge regression cannot be free from the curse of dimensionality. Until now, the literature on kernel ridge regression has suggested that the gap between theory and practice in relation to dimensionality has not narrowed. In this study, we first investigate when the influence of dimensionality does not significantly affect the convergence rate of the kernel ridge regression. Specifically, we study the convergence rate of <span><math><msub><mrow><mi>L</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span> and <span><math><msub><mrow><mi>L</mi></mrow><mrow><mi>∞</mi></mrow></msub></math></span> risks for the kernel ridge estimator, with a focus on reproducing kernel Hilbert space (RKHS) generated by a product kernel. We show that the univariate optimal convergence rate up to a logarithmic factor in <span><math><msub><mrow><mi>L</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span> and <span><math><msub><mrow><mi>L</mi></mrow><mrow><mi>∞</mi></mrow></msub></math></span> risks can be achieved by controlling the size of the RKHS. 
The result of a numerical study confirms our theoretical findings.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"236 ","pages":"Article 106228"},"PeriodicalIF":0.8,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayes oracle property of multiple tests of multivariate normal means under sparsity","authors":"Zikun Qin, Malay Ghosh","doi":"10.1016/j.jspi.2024.106227","DOIUrl":"10.1016/j.jspi.2024.106227","url":null,"abstract":"<div><p>The paper considers a multiple testing problem of multivariate normal means under sparsity. First, the Bayes risk of the multivariate Bayes oracle is derived. Then, a hierarchical Bayesian approach is taken with global–local shrinkage priors, where the global parameter is either treated as a tuning parameter or is given a specific prior. The method is shown to attain an asymptotic Bayes optimal under sparsity (ABOS) property. Finally, an empirical Bayes procedure is proposed which involves estimation of the global shrinkage parameter. The approach is also shown to lead to the ABOS property.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"235 ","pages":"Article 106227"},"PeriodicalIF":0.8,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142088421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessing heterogeneity in treatment initiation guidelines in longitudinal randomized controlled trials","authors":"Hyunkeun Ryan Cho , Seonjin Kim","doi":"10.1016/j.jspi.2024.106226","DOIUrl":"10.1016/j.jspi.2024.106226","url":null,"abstract":"<div><p>Treatment initiation guidelines are essential in healthcare, dictating when patients begin therapy. These guidelines are typically assessed through randomized controlled trials (RCTs) to measure their average effect on a population. However, this method may not fully account for patient heterogeneity. We introduce a refined analysis methodology that accounts for diverse times to treatment initiation (TTI) arising from these guidelines. We offer a more detailed perspective on the guidelines’ impact by analyzing homogeneous subpopulations based on their TTI. We develop a longitudinal regression model with smooth time functions to capture dynamic changes in average guideline effects on subpopulations (AGES). A unique weighting mechanism creates pseudo-subpopulations from RCT data, enabling consistent and precise estimation of smooth functions. The efficacy of our approach is validated through theoretical and numerical studies, underscoring its capacity to provide insightful statistical inferences. We exemplify the utility of our methodology by applying it to an RCT of the World Health Organization (WHO) guideline for adults with HIV. 
This analysis promises to enhance the evaluation of treatment initiation guidelines, leading to more personalized and efficient patient care.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"235 ","pages":"Article 106226"},"PeriodicalIF":0.8,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141993852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}