Statistics and Computing: Latest Articles

Efficient estimation and correction of selection-induced bias with order statistics
IF 2.2, Zone 2, Mathematics
Statistics and Computing Pub Date: 2024-06-12 DOI: 10.1007/s11222-024-10442-4
Yann McLatchie, Aki Vehtari
Abstract: Model selection aims to identify a sufficiently well-performing model that is possibly simpler than the most complex model among a pool of candidates. However, the decision-making process itself can inadvertently introduce non-negligible bias when the cross-validation estimates of predictive performance are marred by excessive noise. In finite data regimes, cross-validated estimates can encourage the statistician to select one model over another when it is not actually better for future data. While this bias remains negligible in the case of few models, when the pool of candidates grows, and model selection decisions are compounded (as in step-wise selection), the expected magnitude of selection-induced bias is likely to grow too. This paper introduces an efficient approach to estimate and correct selection-induced bias based on order statistics. Numerical experiments demonstrate the reliability of our approach in estimating both selection-induced bias and over-fitting along compounded model selection decisions, with specific application to forward search. This work represents a lightweight alternative to more computationally expensive approaches to correcting selection-induced bias, such as nested cross-validation and the bootstrap. Our approach rests on several theoretical assumptions, and we provide a diagnostic to help understand when these may not be valid and when to fall back on safer, albeit more computationally expensive, approaches. The accompanying code facilitates its practical implementation and fosters further exploration in this area.
Citations: 0
Bias-reduced and variance-corrected asymptotic Gaussian inference about extreme expectiles
IF 2.2, Zone 2, Mathematics
Statistics and Computing Pub Date: 2024-06-07 DOI: 10.1007/s11222-023-10359-4
A. Daouia, Gilles Stupfler, Antoine Usseglio‐Carleve
Citations: 1
Jittering and clustering: strategies for the construction of robust designs
IF 2.2, Zone 2, Mathematics
Statistics and Computing Pub Date: 2024-06-04 DOI: 10.1007/s11222-024-10436-2
Douglas P. Wiens
Abstract: We discuss, and give examples of, methods for randomly implementing some minimax robust designs from the literature. These have the advantage, over their deterministic counterparts, of having bounded maximum loss in large and very rich neighbourhoods of the, almost certainly inexact, response model fitted by the experimenter. Their maximum loss rivals that of the theoretically best possible, but not implementable, minimax designs. The procedures are then extended to more general robust designs. For two-dimensional designs we sample from contractions of Voronoi tessellations, generated by selected basis points, which partition the design space. These ideas are then extended to k-dimensional designs for general k.
Citations: 0
Testing the goodness-of-fit of the stable distributions with applications to German stock index data and Bitcoin cryptocurrency data
IF 2.2, Zone 2, Mathematics
Statistics and Computing Pub Date: 2024-06-03 DOI: 10.1007/s11222-024-10441-5
Ruhul Ali Khan, Ayan Pal, Debasis Kundu
Abstract: Outlier-prone data sets are of immense interest in diverse areas including economics, finance, statistical physics, signal processing, telecommunications and so on. Stable laws (also known as α-stable laws) are often found to be useful in modeling outlier-prone data containing important information and exhibiting heavy-tailed phenomena. In this article, an asymptotic distribution of an unbiased and consistent estimator of the stability index α is proposed based on the jackknife empirical likelihood (JEL) and adjusted JEL methods. Next, using the sum-preserving property of stable random variables and exploiting U-statistic theory, we have developed a goodness-of-fit test procedure for α-stable distributions where the stability index α is specified. Extensive simulation studies are performed in order to assess the finite sample performance of the proposed test. Finally, two appealing real-life data examples related to the daily closing price of the German Stock Index and the Bitcoin cryptocurrency are analysed in detail for illustration purposes.
Citations: 0
Insufficient Gibbs sampling
IF 2.2, Zone 2, Mathematics
Statistics and Computing Pub Date: 2024-05-31 DOI: 10.1007/s11222-024-10423-7
Antoine Luciano, Christian P. Robert, Robin J. Ryder
Abstract: In some applied scenarios, the availability of complete data is restricted, often due to privacy concerns; only aggregated, robust and inefficient statistics derived from the data are made accessible. These robust statistics are not sufficient, but they demonstrate reduced sensitivity to outliers and offer enhanced data protection due to their higher breakdown point. We consider a parametric framework and propose a method to sample from the posterior distribution of parameters conditioned on various robust and inefficient statistics: specifically, the pairs (median, MAD) or (median, IQR), or a collection of quantiles. Our approach leverages a Gibbs sampler and simulates latent augmented data, which facilitates simulation from the posterior distribution of parameters belonging to specific families of distributions. A by-product of these samples from the joint posterior distribution of parameters and data given the observed statistics is that we can estimate Bayes factors based on observed statistics via bridge sampling. We validate and outline the limitations of the proposed methods through toy examples and an application to real-world income data.
Citations: 0
Optimization of the generalized covariance estimator in noncausal processes
IF 2.2, Zone 2, Mathematics
Statistics and Computing Pub Date: 2024-05-31 DOI: 10.1007/s11222-024-10437-1
Gianluca Cubadda, Francesco Giancaterini, Alain Hecq, Joann Jasiak
Abstract: This paper investigates the performance of routinely used optimization algorithms in application to the Generalized Covariance estimator (GCov) for univariate and multivariate mixed causal and noncausal models. The GCov is a semi-parametric estimator with an objective function based on nonlinear autocovariances to identify causal and noncausal orders. When the number and type of nonlinear autocovariances included in the objective function are insufficient/inadequate, or the error density is too close to the Gaussian, identification issues can arise. These issues result in local minima in the objective function, which correspond to parameter values associated with incorrect causal and noncausal orders. Then, depending on the starting point and the optimization algorithm employed, the algorithm can converge to a local minimum. The paper proposes the Simulated Annealing (SA) optimization algorithm as an alternative to conventional numerical optimization methods. The results demonstrate that SA performs well in its application to mixed causal and noncausal models, successfully eliminating the effects of local minima. The proposed approach is illustrated by an empirical study of a bivariate series of commodity prices.
Citations: 0
A modified EM-type algorithm to estimate semi-parametric mixtures of non-parametric regressions
IF 2.2, Zone 2, Mathematics
Statistics and Computing Pub Date: 2024-05-29 DOI: 10.1007/s11222-024-10435-3
Sphiwe B. Skhosana, Salomon M. Millard, Frans H. J. Kanfer
Abstract: Semi-parametric Gaussian mixtures of non-parametric regressions (SPGMNRs) are a flexible extension of Gaussian mixtures of linear regressions (GMLRs). The model assumes that the component regression functions (CRFs) are non-parametric functions of the covariate(s) whereas the component mixing proportions and variances are constants. Unfortunately, the model cannot be reliably estimated using traditional methods. A local-likelihood approach for estimating the CRFs requires that we maximize a set of local-likelihood functions. Using the Expectation-Maximization (EM) algorithm to separately maximize each local-likelihood function may lead to label-switching. This is because the posterior probabilities calculated at the local E-step are not guaranteed to be aligned. The consequence of this label-switching is wiggly and non-smooth estimates of the CRFs. In this paper, we propose a unified approach to address label-switching and obtain sensible estimates. The proposed approach has two stages. In the first stage, we propose a model-based approach to address the label-switching problem. We first note that each local-likelihood function is a likelihood function of a Gaussian mixture model (GMM). Next, we reformulate the SPGMNRs model as a mixture of these GMMs. Lastly, using a modified version of the Expectation Conditional Maximization (ECM) algorithm, we estimate the mixture of GMMs. In addition, using the mixing weights of the local GMMs, we can automatically choose the local points where local-likelihood estimation takes place. In the second stage, we propose one-step backfitting estimates of the parametric and non-parametric terms. The effectiveness of the proposed approach is demonstrated on simulated data and real data analysis.
Citations: 0
Generalized fused Lasso for grouped data in generalized linear models
IF 2.2, Zone 2, Mathematics
Statistics and Computing Pub Date: 2024-05-25 DOI: 10.1007/s11222-024-10433-5
Mineaki Ohishi
Abstract: Generalized fused Lasso (GFL) is a powerful method based on adjacent relationships or the network structure of data. It is used in a number of research areas, including clustering, discrete smoothing, and spatio-temporal analysis. When applying GFL, the specific optimization method used is an important issue. In generalized linear models, efficient algorithms based on the coordinate descent method have been developed for trend filtering under the binomial and Poisson distributions. However, to apply GFL to other distributions, such as the negative binomial distribution, which is used to deal with overdispersion in the Poisson distribution, or the gamma and inverse Gaussian distributions, which are used for positive continuous data, an algorithm for each individual distribution must be developed. To unify GFL for distributions in the exponential family, this paper proposes a coordinate descent algorithm for generalized linear models. To illustrate the method, a real data example of spatio-temporal analysis is provided.
Citations: 0
Type I Tobit Bayesian Additive Regression Trees for censored outcome regression
IF 2.2, Zone 2, Mathematics
Statistics and Computing Pub Date: 2024-05-24 DOI: 10.1007/s11222-024-10434-4
Eoghan O’Neill
Abstract: Censoring occurs when an outcome is unobserved beyond some threshold value. Methods that do not account for censoring produce biased predictions of the unobserved outcome. This paper introduces Type I Tobit Bayesian Additive Regression Tree (TOBART-1) models for censored outcomes. Simulation results and real data applications demonstrate that TOBART-1 produces accurate predictions of censored outcomes. TOBART-1 provides posterior intervals for the conditional expectation and other quantities of interest. The error term distribution can have a large impact on the expectation of the censored outcome. Therefore, the error is flexibly modeled as a Dirichlet process mixture of normal distributions. An R package is available at https://github.com/EoghanONeill/TobitBART.
Citations: 0
Group sparse structural smoothing recovery: model, statistical properties and algorithm
IF 2.2, Zone 2, Mathematics
Statistics and Computing Pub Date: 2024-05-23 DOI: 10.1007/s11222-024-10438-0
Zuoxun Tan, Hu Yang
Citations: 0