{"title":"The two-sample location shift model under log-concavity","authors":"Riddhiman Saha , Priyam Das , Nilanjana Laha","doi":"10.1016/j.jspi.2025.106272","DOIUrl":"10.1016/j.jspi.2025.106272","url":null,"abstract":"<div><div>In this paper, we consider the two-sample location shift model, a classic semiparametric model introduced by Stein (1956). This model is known for its adaptive nature, enabling nonparametric estimation with full parametric efficiency. Existing nonparametric estimators of the location shift often depend on external tuning parameters, which restricts their practical applicability (Van et al., 1998). We demonstrate that introducing an additional assumption of log-concavity on the underlying density can alleviate the need for tuning parameters. We propose a one step estimator for location shift estimation, utilizing log-concave density estimation techniques to facilitate tuning-free estimation of the efficient influence function. While we use a truncated version of the one step estimator to theoretically demonstrate adaptivity, our simulations indicate that the one step estimators perform best with zero truncation, eliminating the need for tuning during practical implementation. Notably, the efficiency of the truncated one step estimators steadily increases as the truncation level decreases, and those with low levels of truncation exhibit nearly identical empirical performance to the estimator with zero truncation. We apply our method to investigate the location shift in the distribution of Spanish annual household incomes following the 2008 financial crisis.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"238 ","pages":"Article 106272"},"PeriodicalIF":0.8,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143150096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On cross-validated estimation of skew normal model","authors":"Jian Zhang , Tong Wang","doi":"10.1016/j.jspi.2025.106271","DOIUrl":"10.1016/j.jspi.2025.106271","url":null,"abstract":"<div><div>The skew normal model suffers from inferential drawbacks, namely singular Fisher information when the model is close to symmetry and divergence of the maximum likelihood estimate. This causes a large variation of the conventional maximum likelihood estimate. To address the above drawbacks, Azzalini and Arellano-Valle (2013) introduced maximum penalised likelihood estimation (MPLE) by subtracting a penalty function from the log-likelihood function with a pre-specified penalty coefficient. Here, we propose a cross-validated MPLE to improve its performance when the underlying model is close to symmetry. We develop a theory for MPLE, where an asymptotic rate for the cross-validated penalty coefficient is derived. We further show that the proposed cross-validated MPLE is asymptotically efficient under certain conditions. In simulation studies and a real data application, we demonstrate that the proposed estimator can outperform the conventional MPLE when the model is close to symmetry.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"238 ","pages":"Article 106271"},"PeriodicalIF":0.8,"publicationDate":"2025-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143150094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model averaging prediction for survival data with time-dependent effects","authors":"Xiaoguang Wang , Rong Hu , Mengyu Li","doi":"10.1016/j.jspi.2024.106260","DOIUrl":"10.1016/j.jspi.2024.106260","url":null,"abstract":"<div><div>It is a fundamental task to predict patients’ survival outcomes in clinical research. As an extension of the Cox proportional hazards model, the time-dependent coefficient Cox model is typically utilized for time-to-event data with time-dependent effects. When the number of covariates is large, the curse of dimensionality emerges for most existing methods. To overcome the limitation and improve predictive performance, a semiparametric model averaging approach is proposed for the time-dependent coefficient Cox model. We introduce a novel criterion to estimate model weights and demonstrate its theoretical properties. Extensive simulation studies are conducted to compare the proposed technique with existing competitive methods. A real clinical data set is also analyzed to illustrate the advantages of our approach.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"238 ","pages":"Article 106260"},"PeriodicalIF":0.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143150095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Marginally constrained nonparametric Bayesian inference through Gaussian processes","authors":"Bingjing Tang , Vinayak Rao","doi":"10.1016/j.jspi.2024.106261","DOIUrl":"10.1016/j.jspi.2024.106261","url":null,"abstract":"<div><div>Nonparametric Bayesian models are used routinely as flexible and powerful models of complex data. In many situations, an applied scientist may have additional informative beliefs about the data distribution of interest, for instance, the distribution of its mean or of a subset of its components. This often will not be compatible with the nonparametric prior. An important challenge is then to incorporate this partial prior belief into nonparametric Bayesian models. In this paper, we are motivated by settings where practitioners have additional distributional information about a subset of the coordinates of the observations being modeled. Our approach links this problem to that of conditional density modeling. Our main idea is a novel constrained Bayesian model, based on a perturbation of a parametric distribution with a transformed Gaussian process prior on the perturbation function. We develop a corresponding posterior sampling method based on data augmentation. We illustrate the efficacy of our proposed constrained nonparametric Bayesian model in a variety of real-world scenarios including modeling environmental and earthquake data.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"237 ","pages":"Article 106261"},"PeriodicalIF":0.8,"publicationDate":"2024-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143133631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deterministic construction methods for asymmetrical uniform designs","authors":"Liuping Hu , Kashinath Chatterjee , Jianhui Ning , Hong Qin","doi":"10.1016/j.jspi.2024.106262","DOIUrl":"10.1016/j.jspi.2024.106262","url":null,"abstract":"<div><div>Asymmetrical (mixed-level) uniform designs are useful for both computer and physical experiments. However, constructing these designs is often challenging due to their complex asymmetrical structure. In this paper, we propose novel methods for constructing uniform designs with mixed two-, three-, and four/nine-levels. Our construction methods are deterministic, allowing us to circumvent the complexity associated with stochastic algorithms. We evaluate uniformity using the wrap-around <span><math><msub><mrow><mi>L</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span>- and Lee discrepancies. We establish useful analytic relationships between uniformity and aberration, and derive new general lower bounds for discrepancies that are tighter than those currently available in the literature. These new benchmarks can effectively measure the uniformity of asymmetrical designs. Additionally, we provide examples demonstrating the efficacy of our construction methods and the relevance of the newly obtained lower bounds. Finally, through simulations, we show that the designs produced using our methods perform well in constructing statistical surrogate models.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"237 ","pages":"Article 106262"},"PeriodicalIF":0.8,"publicationDate":"2024-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143133632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Maximum likelihood estimation of short panel autoregressive models with flexible form of fixed effects","authors":"Kazuhiko Hayakawa, Boyan Yin","doi":"10.1016/j.jspi.2024.106252","DOIUrl":"10.1016/j.jspi.2024.106252","url":null,"abstract":"<div><div>This paper proposes the maximum likelihood (ML) estimator for a short panel autoregressive model with a flexible form of observed factors as well as unknown interactive fixed effects. We show that the ML estimator is consistent and asymptotically normally distributed as the number of cross-sectional units increases with the number of time periods being fixed. It should be noted that this asymptotic result holds uniformly for the autoregressive coefficient less than, equal to, or greater than one, in sharp contrast to existing estimators. Monte Carlo simulation results show that the ML estimator has desirable finite sample properties.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"237 ","pages":"Article 106252"},"PeriodicalIF":0.8,"publicationDate":"2024-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143133630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Outcome dependent subsampling divide and conquer in generalized linear models for massive data","authors":"Jie Yin , Jieli Ding , Changming Yang","doi":"10.1016/j.jspi.2024.106253","DOIUrl":"10.1016/j.jspi.2024.106253","url":null,"abstract":"<div><div>In order to break the constraints and barriers caused by limited computing power in processing massive datasets, we propose an outcome dependent subsampling divide and conquer strategy in this paper. The proposed strategy can process data on multiple blocks in parallel and concentrate the computing resources of each block on regions with the most information. We develop a distributed statistical inference method and propose a computation-efficient algorithm in the generalized linear models for massive data. The proposed method only needs to preserve some summary statistics from each data block and then uses them to directly construct the proposed estimator. The asymptotic properties of the proposed method are established. Simulation studies and real data analysis are conducted to illustrate the merits of the proposed method.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"237 ","pages":"Article 106253"},"PeriodicalIF":0.8,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143133629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonparametric estimators of inequality curves and inequality measures","authors":"Alicja Jokiel-Rokita, Sylwester Piątek","doi":"10.1016/j.jspi.2024.106251","DOIUrl":"10.1016/j.jspi.2024.106251","url":null,"abstract":"<div><div>Classical inequality curves and inequality measures are defined for distributions with finite mean value. Moreover, their empirical counterparts are not resistant to outliers. For these reasons, quantile versions of known inequality curves such as the Lorenz, Bonferroni, Zenga and <span><math><mi>D</mi></math></span> curves, and quantile versions of inequality measures such as the Gini, Bonferroni, Zenga and <span><math><mi>D</mi></math></span> indices have been proposed in the literature. We propose various nonparametric estimators of quantile versions of inequality curves and inequality measures, prove their consistency, and compare their accuracy in a simulation study. We also give examples of the use of quantile versions of inequality measures in real data analysis.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"237 ","pages":"Article 106251"},"PeriodicalIF":0.8,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143133628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation and group-feature selection in sparse mixture-of-experts with diverging number of parameters","authors":"Abbas Khalili , Archer Yi Yang , Xiaonan Da","doi":"10.1016/j.jspi.2024.106250","DOIUrl":"10.1016/j.jspi.2024.106250","url":null,"abstract":"<div><div>Mixture-of-experts provide flexible statistical models for a wide range of regression (supervised learning) problems. Often a large number of covariates (features) are available in many modern applications yet only a small subset of them is useful in explaining a response variable of interest. This calls for a feature selection device. In this paper, we present new group-feature selection and estimation methods for sparse mixture-of-experts models when the number of features can be nearly comparable to the sample size. We prove the consistency of the methods in both parameter estimation and feature selection. We implement the methods using a modified EM algorithm combined with proximal gradient method which results in a convenient closed-form parameter update in the M-step of the algorithm. We examine the finite-sample performance of the methods through simulations, and demonstrate their applications in a real data example on exploring relationships in body measurements.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"237 ","pages":"Article 106250"},"PeriodicalIF":0.8,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142705363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling and testing for endpoint-inflated count time series with bounded support","authors":"Yao Kang , Xiaojing Fan , Jie Zhang , Ying Tang","doi":"10.1016/j.jspi.2024.106248","DOIUrl":"10.1016/j.jspi.2024.106248","url":null,"abstract":"<div><div>Count time series with bounded support frequently exhibit binomial overdispersion, zero inflation and right-endpoint inflation in practical scenarios. Numerous models have been proposed for the analysis of bounded count time series with binomial overdispersion and zero inflation, yet right-endpoint inflation has received comparatively less attention. To better capture these features, this article introduces three versions of extended first-order binomial autoregressive (BAR(1)) models with endpoint inflation. Corresponding stochastic properties of the new models are investigated and model parameters are estimated by the conditional maximum likelihood and quasi-maximum likelihood methods. A binomial right-endpoint inflation index is also constructed and further used to test whether the data set has endpoint-inflated characteristic with respect to a BAR(1) process. Finally, the proposed models are applied to two real data examples. Firstly, we illustrate the usefulness of the proposed models through an application to the voting data on supporting interest rate changes during consecutive monthly meetings of the Monetary Policy Council at the National Bank of Poland. Then, we apply the proposed models to the number of police stations that received at least one drunk driving report per month. The results of the two real data examples indicate that the new models have significant advantages in terms of fitting performance for the bounded count time series with endpoint inflation.</div></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"237 ","pages":"Article 106248"},"PeriodicalIF":0.8,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142759599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}