{"title":"Stable convergence of conditional least squares estimators for supercritical continuous state and continuous time branching processes with immigration","authors":"Mátyás Barczy","doi":"10.1016/j.jspi.2024.106213","DOIUrl":"10.1016/j.jspi.2024.106213","url":null,"abstract":"<div><p>We prove stable convergence of conditional least squares estimators of drift parameters for supercritical continuous state and continuous time branching processes with immigration based on discrete time observations.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"235 ","pages":"Article 106213"},"PeriodicalIF":0.8,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141937768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Some clustering-based change-point detection methods applicable to high dimension, low sample size data","authors":"Trisha Dawn, Angshuman Roy, Alokesh Manna, Anil K. Ghosh","doi":"10.1016/j.jspi.2024.106212","DOIUrl":"10.1016/j.jspi.2024.106212","url":null,"abstract":"<div><p>Detection of change-points in a sequence of high dimensional observations is a challenging problem, and this becomes even more challenging when the sample size (i.e., the sequence length) is small. In this article, we propose some change-point detection methods based on clustering, which can be conveniently used in such high dimension, low sample size situations. First, we consider the single change-point problem. Using <span><math><mi>k</mi></math></span>-means clustering based on suitable dissimilarity measures, we propose some methods for testing the existence of a change-point and estimating its location. The high dimensional behavior of these proposed methods is investigated under appropriate regularity conditions. Next, we extend our methods to the detection of multiple change-points. We carry out extensive numerical studies and analyze a real data set to compare the performance of our proposed methods with some state-of-the-art methods.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106212"},"PeriodicalIF":0.8,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141937763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Regression to the mean for overdispersed count data","authors":"Kiran Iftikhar, Manzoor Khan, Jake Olivier","doi":"10.1016/j.jspi.2024.106211","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106211","url":null,"abstract":"<div><p>In repeated measurements, regression to the mean (RTM) is a tendency of subjects with observed extreme values to move closer to the mean when measured a second time. Not accounting for RTM could lead to incorrect decisions such as when observed natural variation is incorrectly attributed to the effect of a treatment/intervention. A strategy for addressing RTM is to decompose the <em>total effect</em>, the expected difference in paired random variables conditional on the first being in the tail of its distribution, into regression to the mean and unbiased treatment effects. The unbiased treatment effect can then be estimated by subtraction. Formulae are available in the literature to quantify RTM for Poisson distributed data which are constrained by mean–variance equivalence, although there are many real life examples of overdispersed count data that are not well approximated by the Poisson. The negative binomial can be considered an explicit overdispersed Poisson process where the Poisson intensity is chosen from a gamma distribution. In this study, the truncated bivariate negative binomial distribution is used to decompose the total effect formulae into RTM and treatment effects. Maximum likelihood estimators (MLE) and method of moments estimators are developed for the total, RTM, and treatment effects. A simulation study is carried out to investigate the properties of the estimators and compare them with those developed under the assumption of the Poisson process. Data on the incidence of dengue cases reported from 2007 to 2017 are used to estimate the total, RTM, and treatment effects.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106211"},"PeriodicalIF":0.8,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141606665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Oracle-efficient estimation and global inferences for variance function of functional data","authors":"Li Cai, Suojin Wang","doi":"10.1016/j.jspi.2024.106210","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106210","url":null,"abstract":"<div><p>A new two-step reconstruction-based moment estimator and an asymptotically correct smooth simultaneous confidence band as a global inference tool are proposed for the heteroscedastic variance function of dense functional data. Step one involves spline smoothing for individual trajectory reconstructions and step two employs kernel regression on the individual squared residuals to estimate each trajectory's variability. Then, by the method of moments, an estimator for the variance function of functional data is constructed. The estimation procedure is innovative in synthesizing spline smoothing and kernel regression, which allows one not only to exploit the fast computing speed of spline regression but also to employ the flexible local estimation and the extreme value theory of kernel smoothing. The resulting estimator for the variance function is shown to be oracle-efficient in the sense that it is uniformly as efficient as the ideal estimator when all trajectories are known by an “oracle”. As a result, an asymptotically correct simultaneous confidence band for the variance function is established. Simulation results support our asymptotic theory with fast computation. As an illustration, the proposed method is applied to the analyses of two real data sets leading to a number of discoveries.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106210"},"PeriodicalIF":0.8,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141593789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Column expanded Latin hypercube designs","authors":"Qiao Wei, Jian-Feng Yang, Min-Qian Liu","doi":"10.1016/j.jspi.2024.106208","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106208","url":null,"abstract":"<div><p>Maximin distance designs and orthogonal designs are extensively applied in computer experiments, but the construction of such designs is challenging, especially under the maximin distance criterion. In this paper, by adding columns to a fold-over optimal maximin <span><math><msub><mrow><mi>L</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span>-distance Latin hypercube design (LHD), we construct a class of LHDs, called column expanded LHDs, which are nearly optimal under both the maximin <span><math><msub><mrow><mi>L</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span>-distance and orthogonality criteria. The advantage of the proposed method is that the resulting designs have flexible numbers of factors without computer search. Detailed comparisons with existing LHDs show that the constructed LHDs have larger minimum distances between design points and smaller correlation coefficients between distinct columns.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106208"},"PeriodicalIF":0.8,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141541841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The impact of misclassification on covariate-adaptive randomized clinical trials with generalized linear models","authors":"Tong Wang, Wei Ma","doi":"10.1016/j.jspi.2024.106209","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106209","url":null,"abstract":"<div><p>Covariate-adaptive randomization (CAR) is a type of randomization method that uses covariate information to enhance the comparability between different treatment groups. Under such randomization, the covariate is usually well balanced, i.e., the imbalance between the treatment group and placebo group is controlled. In practice, the covariate is sometimes misclassified. The covariate misclassification affects the CAR itself and statistical inferences after the CAR. In this paper, we examine the impact of covariate misclassification on CAR from two aspects. First, we study the balancing properties of CAR with unequal allocation in the presence of covariate misclassification. We show the convergence rate of the imbalance and compare it with that under the true covariate. Second, we study the hypothesis test under CAR with misclassified covariates in a generalized linear model (GLM) framework. We consider both the unadjusted and adjusted models. To illustrate the theoretical results, we discuss the validity of test procedures for three commonly used GLMs, i.e., logistic regression, Poisson regression and the exponential model. Specifically, we show that the adjusted model is often invalid when the misclassified covariates are adjusted. In this case, we provide a simple correction for the inflated Type-I error. The correction is useful and easy to implement because it does not require misclassification specification and estimation of the misclassification rate. Our study enriches the literature on the impact of covariate misclassification on CAR and provides a practical approach for handling misclassification.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106209"},"PeriodicalIF":0.8,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141593759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A zero-estimator approach for estimating the signal level in a high-dimensional model-free setting","authors":"Ilan Livne, David Azriel, Yair Goldberg","doi":"10.1016/j.jspi.2024.106207","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106207","url":null,"abstract":"<div><p>We study a high-dimensional regression setting under the assumption of known covariate distribution. We aim at estimating the amount of explained variation in the response by the best linear function of the covariates (the signal level). In our setting, neither sparsity of the coefficient vector, nor normality of the covariates or linearity of the conditional expectation are assumed. We present an unbiased and consistent estimator and then improve it by using a zero-estimator approach, where a zero-estimator is a statistic whose expected value is zero. More generally, we present an algorithm based on the zero estimator approach that in principle can improve any given estimator. We study some asymptotic properties of the proposed estimators and demonstrate their finite sample performance in a simulation study.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106207"},"PeriodicalIF":0.8,"publicationDate":"2024-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141482213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Layer sparsity in neural networks","authors":"Mohamed Hebiri, Johannes Lederer, Mahsa Taheri","doi":"10.1016/j.jspi.2024.106195","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106195","url":null,"abstract":"<div><p>Sparsity has become popular in machine learning because it can save computational resources, facilitate interpretations, and prevent overfitting. This paper discusses sparsity in the framework of neural networks. In particular, we formulate a new notion of sparsity, called layer sparsity, that concerns the networks’ layers and, therefore, aligns particularly well with the current trend toward deep networks. We then introduce corresponding regularization and refitting schemes that can complement standard deep-learning pipelines to generate more compact and accurate networks.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106195"},"PeriodicalIF":0.9,"publicationDate":"2024-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0378375824000521/pdfft?md5=b1aa1392925da05f5ac50fc5d4831546&pid=1-s2.0-S0378375824000521-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141323230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High dimensional discriminant rules with shrinkage estimators of the covariance matrix and mean vector","authors":"Jaehoan Kim, Junyong Park, Hoyoung Park","doi":"10.1016/j.jspi.2024.106199","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106199","url":null,"abstract":"<div><p>Linear discriminant analysis (LDA) is a typical method for classification problems with large dimensions and small samples. There are various types of LDA methods that are based on the different types of estimators for the covariance matrices and mean vectors. In this paper, we consider shrinkage methods based on a non-parametric approach. For the precision matrix, methods based on the sparsity structure or data splitting are examined. Regarding the estimation of mean vectors, Non-parametric Empirical Bayes (NPEB) methods and Non-parametric Maximum Likelihood Estimation (NPMLE) methods, also known as <span><math><mi>f</mi></math></span>-modeling and <span><math><mi>g</mi></math></span>-modeling, respectively, are adopted. The performance of linear discriminant rules based on combined estimation strategies of the covariance matrix and mean vectors is analyzed in this study. In particular, the study presents a theoretical result on the performance of the NPEB method and compares it with previous studies. Simulation studies with various covariance matrices and mean vector structures are conducted to evaluate the methods discussed in this paper. Furthermore, real data examples such as gene expressions and EEG data are also presented.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106199"},"PeriodicalIF":0.9,"publicationDate":"2024-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141422982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mixed-integer linear programming for computing optimal experimental designs","authors":"Radoslav Harman, Samuel Rosa","doi":"10.1016/j.jspi.2024.106200","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106200","url":null,"abstract":"<div><p>The problem of computing an exact experimental design that is optimal for the least-squares estimation of the parameters of a regression model is considered. We show that this problem can be solved via mixed-integer linear programming (MILP) for a wide class of optimality criteria, including the criteria of A-, I-, G- and MV-optimality. This approach improves upon the current state-of-the-art mathematical programming formulation, which uses mixed-integer second-order cone programming. The key idea underlying the MILP formulation is McCormick relaxation, which critically depends on finite interval bounds for the elements of the covariance matrix of the least-squares estimator corresponding to an optimal exact design. We provide both analytic and algorithmic methods for constructing these bounds. We also demonstrate the unique advantages of the MILP approach, such as the possibility of incorporating multiple design constraints into the optimization problem, including constraints on the variances and covariances of the least-squares estimator.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106200"},"PeriodicalIF":0.9,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141323229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}