Computational Statistics & Data Analysis: Latest Articles

Statistical inference for partially shape-constrained function-on-scalar linear regression models
IF 1.5, Zone 3, Mathematics
Computational Statistics & Data Analysis, Volume 211, Article 108200. Pub Date: 2025-05-12. DOI: 10.1016/j.csda.2025.108200
Kyunghee Han, Yeonjoo Park, Soo-Young Kim
Abstract: Functional linear regression models are widely used to link functional/longitudinal outcomes with multiple scalar predictors, identifying time-varying covariate effects through regression coefficient functions. Beyond assessing statistical significance, characterizing the shapes of coefficient functions is crucial for drawing interpretable scientific conclusions. Existing studies on shape-constrained analysis primarily focus on global shapes, which require strict prior knowledge of functional relationships across the entire domain. This often leads to misspecified regression models due to a lack of prior information, making them impractical for real-world applications. To address this, a flexible framework is introduced to identify partial shapes in regression coefficient functions. The proposed partial shape-constrained analysis enables researchers to validate functional shapes within a targeted sub-domain, avoiding the misspecification of shape constraints outside the sub-domain of interest. The method also allows for testing different sub-domains for individual covariates and multiple partial shape constraints across composite sub-domains. Our framework supports both kernel- and spline-based estimation approaches, ensuring robust performance with flexibility in computational preference. Finite-sample experiments across various scenarios demonstrate that the proposed framework significantly outperforms the application of global shape constraints to partial domains in both estimation and inference procedures. The inferential tool particularly maintains the type I error rate at the nominal significance level and exhibits increasing power with larger sample sizes, confirming the consistency of the test procedure. The practicality of partial shape-constrained inference is demonstrated through two applications: a clinical trial on NeuroBloc for type A-resistant cervical dystonia and the National Institute of Mental Health Schizophrenia Study.
Citations: 0
Distributed variable screening for generalized linear models
IF 1.5, Zone 3, Mathematics
Computational Statistics & Data Analysis, Volume 211, Article 108203. Pub Date: 2025-05-12. DOI: 10.1016/j.csda.2025.108203
Tianbo Diao, Bo Li, Lianqiang Qu, Liuquan Sun
Abstract: In this article, we develop a distributed variable screening method for generalized linear models. This method is designed to handle situations where both the sample size and the number of covariates are large. Specifically, the proposed method selects relevant covariates by using a sparsity-restricted surrogate likelihood estimator. It takes into account the joint effects of the covariates rather than just the marginal effect, and this characteristic enhances the reliability of the screening results. We establish the sure screening property of the proposed method, which ensures that with a high probability, the true model is included in the selected model. Simulation studies are conducted to evaluate the finite sample performance of the proposed method, and an application to a real dataset showcases its practical utility.
Citations: 0
Quantile Super Learning for independent and online settings with application to solar power forecasting
IF 1.5, Zone 3, Mathematics
Computational Statistics & Data Analysis, Volume 211, Article 108202. Pub Date: 2025-05-09. DOI: 10.1016/j.csda.2025.108202
Herbert Susmann, Antoine Chambaz
Abstract: Estimating quantiles of an outcome conditional on covariates is of fundamental interest in statistics with broad application in probabilistic prediction and forecasting. An ensemble method for conditional quantile estimation is proposed, Quantile Super Learning, that combines predictions from multiple candidate algorithms based on their empirical performance measured with respect to a cross-validated empirical risk of the quantile loss function. Theoretical guarantees for both i.i.d. and online data scenarios are presented. The performance of this approach for quantile estimation and in forming prediction intervals is tested in simulation studies. Two case studies related to solar energy are used to illustrate Quantile Super Learning: in an i.i.d. setting, we predict the physical properties of perovskite materials for photovoltaic cells, and in an online setting we forecast ground solar irradiance based on output from dynamic weather ensemble models.
Citations: 0
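The Quantile Super Learning abstract scores candidate algorithms by the empirical risk of the quantile loss. As a minimal sketch of that loss only (often called the pinball loss), assuming scalar observations and predictions, it can be written as below. The function names `pinball_loss` and `mean_pinball_loss` are hypothetical, and this is not the authors' cross-validated ensemble procedure.

```python
def pinball_loss(y, q_pred, tau):
    """Quantile (pinball) loss of predicting q_pred for outcome y at level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    diff = y - q_pred
    # Under-prediction (diff >= 0) is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def mean_pinball_loss(ys, q_preds, tau):
    """Empirical risk: average pinball loss over paired outcomes and predictions."""
    if len(ys) != len(q_preds) or len(ys) == 0:
        raise ValueError("ys and q_preds must be non-empty and of equal length")
    return sum(pinball_loss(y, q, tau) for y, q in zip(ys, q_preds)) / len(ys)
```

A candidate whose predictions give a smaller `mean_pinball_loss` on held-out data would be preferred at that quantile level; how the paper weights or combines candidates is not specified here.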
Monotone composite quantile regression neural network for censored data with a cure fraction
IF 1.5, Zone 3, Mathematics
Computational Statistics & Data Analysis, Volume 211, Article 108201. Pub Date: 2025-05-08. DOI: 10.1016/j.csda.2025.108201
Xinran Zhang, Xiaohui Yuan, Chunjie Wang, Xinyuan Song
Abstract: The cure rate monotone composite quantile regression neural network model is investigated as an extension of the cure rate quantile model. It can uncover complex nonlinear relationships and effectively ensure the non-crossing of quantile predictions. An iterative algorithm coupled with data augmentation is developed to predict the survival time of susceptible subjects and the cure rate among all subjects. Simulation studies indicate that the proposed approach exhibits advantages in prediction over traditional statistical methods in finite samples when nonlinearity exists between response and predictors. The analysis of two real datasets further validates the utility of the proposed method.
Citations: 0
Latent-class trajectory modeling with a heterogeneous mean-variance relation
IF 1.5, Zone 3, Mathematics
Computational Statistics & Data Analysis, Volume 210, Article 108199. Pub Date: 2025-05-02. DOI: 10.1016/j.csda.2025.108199
Niek G.P. Den Teuling, Francesco Ungolo, Steffen C. Pauws, Edwin R. van den Heuvel
Abstract: The benefit of addressing heteroskedastic residual variances across trajectories is investigated with the purpose of finding clusters of longitudinal trajectories. Models are proposed to account for class-specific heteroskedasticity through a mean-variance relation or random residual variance, thereby accounting for trajectory-specific variance. The analyzed latent-class trajectory models are an extension of growth mixture models (GMM). The estimation bias of the model parameters and the recoverability of the number of latent classes are assessed under various data-generating models and settings by means of a simulation study. Furthermore, the empirical applicability of these models is demonstrated through the analysis of the time-varying incidence rate of COVID-19 cases across counties in the United States. Overall, the class-specific mean-variance could be reliably estimated by the proposed models in datasets comprising 250 trajectories. In addition, the extended GMM accounting for the residual random variance showed improved group trajectory estimation over the standard GMM.
Citations: 0
A goodness-of-fit test for geometric Brownian motion
IF 1.5, Zone 3, Mathematics
Computational Statistics & Data Analysis, Volume 210, Article 108196. Pub Date: 2025-04-23. DOI: 10.1016/j.csda.2025.108196
Daniel Gaigall, Philipp Wübbolding
Abstract: A new goodness-of-fit test for the composite null hypothesis that data originate from a geometric Brownian motion is studied in the functional data setting. This is equivalent to testing if the data are from a scaled Brownian motion with linear drift. Critical values for the test are obtained, ensuring that the specified significance level is achieved in finite samples. The asymptotic behavior of the test statistic under the null distribution and alternatives is studied, and it is also demonstrated that the test is consistent. Furthermore, the proposed approach offers advantages in terms of fast and simple implementation. A comprehensive simulation study shows that the power of the new test compares favorably to that of existing methods. A key application is the assessment of financial time series for the suitability of the Black-Scholes model. Examples relating to various stock and interest rate time series are presented in order to illustrate the proposed test.
Citations: 0
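The equivalence stated in the geometric Brownian motion abstract (testing for GBM versus testing for a scaled Brownian motion with linear drift) can be seen from the standard closed form of GBM. The symbols below ($S_t$, $\mu$, $\sigma$, $W_t$) are conventional textbook notation and are an assumption here, not taken from the paper; the step from raw data to logarithms is presumably how the equivalence is applied.

```latex
% Geometric Brownian motion with drift \mu and volatility \sigma > 0:
%   dS_t = \mu S_t \, dt + \sigma S_t \, dW_t, \qquad S_0 > 0.
% Its solution is
\[
  S_t = S_0 \exp\!\Big( \big(\mu - \tfrac{1}{2}\sigma^2\big)\, t + \sigma W_t \Big),
\]
% so taking logarithms gives a Brownian motion scaled by \sigma
% with linear drift \mu - \sigma^2/2:
\[
  \log S_t - \log S_0 = \big(\mu - \tfrac{1}{2}\sigma^2\big)\, t + \sigma W_t .
\]
```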
A simultaneous confidence-bounded true discovery proportion perspective on localizing differences in smooth terms in regression models
IF 1.5, Zone 3, Mathematics
Computational Statistics & Data Analysis, Volume 211, Article 108197. Pub Date: 2025-04-23. DOI: 10.1016/j.csda.2025.108197
David Swanson
Abstract: A method is demonstrated for localizing where two spline terms, or smooths, differ using a true discovery proportion (TDP)-based interpretation. The procedure yields a statement on the proportion of some region where true differences exist between two smooths. The methodology avoids ad hoc approaches to making such statements, like subsetting the data and performing hypothesis tests on the truncated spline terms. TDP estimates are 1-α confidence-bounded simultaneously, which means that a region's TDP estimate is a lower bound on the proportion of actual differences, or true discoveries, in that region, with high confidence regardless of the number of estimates made. The procedure is based on closed-testing using Simes local test. This local test requires that the multivariate χ² test statistics of generalized Wishart type underlying the method be positive regression dependent on subsets (PRDS), a result for which evidence is presented suggesting that the condition holds. Consistency of the procedure is demonstrated for generalized additive models with the tuning parameter chosen by REML or GCV, and the achievement of confidence-bounded TDP is shown in simulation as is an analysis of walking gait.
Citations: 0
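The true discovery proportion abstract builds on closed testing with the Simes local test. As a minimal sketch of the Simes combination alone, assuming a plain list of p-values for one intersection hypothesis: the combined p-value is the minimum over ranks i of m * p_(i) / i, where p_(1) <= ... <= p_(m) are the sorted p-values. The function name `simes_p_value` is hypothetical, and the closed-testing step and the TDP bounds described in the abstract are not implemented here.

```python
def simes_p_value(p_values):
    """Simes combined p-value for the intersection of the given hypotheses."""
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p-value is required")
    if any(p < 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    ordered = sorted(p_values)
    # The i-th smallest p-value is scaled by m / i; the smallest scaled value is returned.
    return min(m * p / rank for rank, p in enumerate(ordered, start=1))
```

The intersection hypothesis would be rejected at level alpha when the returned value is at most alpha; the validity of this step under dependence is what the PRDS condition in the abstract concerns.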
Co-clustering multi-view data using the Latent Block Model
IF 1.5, Zone 3, Mathematics
Computational Statistics & Data Analysis, Volume 210, Article 108188. Pub Date: 2025-04-10. DOI: 10.1016/j.csda.2025.108188
Joshua Tobin, Michaela Black, James Ng, Debbie Rankin, Jonathan Wallace, Catherine Hughes, Leane Hoey, Adrian Moore, Jinling Wang, Geraldine Horigan, Paul Carlin, Helene McNulty, Anne M. Molloy, Mimi Zhang
Abstract: The Latent Block Model (LBM) is a prominent model-based co-clustering method, returning parametric representations of each block-cluster and allowing the use of well-grounded model selection methods. Although the LBM has been adapted to accommodate various feature types, it cannot be applied to datasets consisting of multiple distinct sets of features, termed views, for a common set of observations. The multi-view LBM is introduced herein, extending the LBM method to multi-view data, where each view marginally follows an LBM. For any pair of two views, the dependence between them is captured by a row-cluster membership matrix. A likelihood-based approach is formulated for parameter estimation, harnessing a stochastic EM algorithm merged with a Gibbs sampler, while an ICL criterion is formulated to determine the number of row- and column-clusters in each view. To justify the application of the multi-view approach, hypothesis tests are formulated to evaluate the independence of row-clusters across views, with the testing procedure seamlessly integrated into the estimation framework. A penalty scheme is also introduced to induce sparsity in row-clusterings. The algorithm's performance is validated using synthetic and real-world datasets, accompanied by recommendations for optimal parameter selection. Finally, the multi-view co-clustering method is applied to a complex genomics dataset, and is shown to provide new insights for high-dimension multi-view problems.
Citations: 0
Non-parametric tests for cross-dependence based on multivariate extensions of ordinal patterns
IF 1.5, Zone 3, Mathematics
Computational Statistics & Data Analysis, Volume 210, Article 108189. Pub Date: 2025-04-10. DOI: 10.1016/j.csda.2025.108189
Angelika Silbernagel, Christian H. Weiß, Alexander Schnurr
Abstract: Analyzing the cross-dependence within sequentially observed pairs of random variables is an interesting mathematical problem that also has several practical applications. Most of the time, classical dependence measures like Pearson's correlation are used to this end. This quantity, however, only measures linear dependence and has other drawbacks as well. Different concepts for measuring cross-dependence in sequentially observed random vectors, which are based on so-called ordinal patterns or multivariate generalizations of them, are described. In all cases, limiting distributions of the corresponding test statistics are derived. In a simulation study, the performance of these statistics is compared with three competitors, namely, classical Pearson's and Spearman's correlation as well as the rank-based Chatterjee's correlation coefficient. The applicability of the test statistics is illustrated by using them on two real-world data examples.
Citations: 0
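The cross-dependence abstract relies on ordinal patterns, which record only the relative order of values inside a short window. The sketch below is an illustration of that idea under stated assumptions: a pattern is encoded as the tuple of window positions sorted by value (one of several conventions in use), ties are broken by earlier position, and `coincidence_rate` is a simple descriptive quantity, not one of the paper's test statistics. Both function names are hypothetical.

```python
def ordinal_pattern(window):
    """Encode a window by the positions of its values in increasing order."""
    # sorted() is stable, so tied values keep their original left-to-right order.
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))


def coincidence_rate(x, y, d):
    """Share of length-d windows in which x and y show the same ordinal pattern."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n_windows = len(x) - d + 1
    if d < 2 or n_windows <= 0:
        raise ValueError("need d >= 2 and at least d observations")
    matches = sum(
        ordinal_pattern(x[i:i + d]) == ordinal_pattern(y[i:i + d])
        for i in range(n_windows)
    )
    return matches / n_windows
```

Because only the ordering within each window matters, the result is unchanged by any strictly increasing transformation of either series, which is one reason such patterns can capture dependence beyond the linear kind measured by Pearson's correlation.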
A flexible mixed-membership model for community and enterotype detection for microbiome data
IF 1.5, Zone 3, Mathematics
Computational Statistics & Data Analysis, Volume 210, Article 108181. Pub Date: 2025-04-04. DOI: 10.1016/j.csda.2025.108181
Alice Giampino, Roberto Ascari, Sonia Migliorati
Abstract: Understanding how the human gut microbiome affects host health is challenging due to the wide interindividual variability, sparsity, and high dimensionality of microbiome data. Mixed-membership models have been previously applied to these data to detect latent communities of bacterial taxa that are expected to co-occur. The most widely used mixed-membership model is latent Dirichlet allocation (LDA). However, LDA is limited by the rigidity of the Dirichlet distribution imposed on the community proportions, which hinders its ability to model dependencies and account for overdispersion. To address this limitation, a generalization of LDA is proposed that introduces greater flexibility into the covariance matrix by incorporating the flexible Dirichlet (FD), a specific identifiable mixture with Dirichlet components. In addition to identifying communities, the new model enables the detection of enterotypes, i.e., clusters of samples with similar microbe composition. For inferential purposes, a computationally efficient collapsed Gibbs sampler that exploits the conjugacy of the FD distribution with respect to the multinomial model is proposed. A simulation study demonstrates the model's ability to accurately recover true parameter values by minimizing appropriate compositional discrepancy measures between the true and estimated values. Additionally, the model correctly identifies the number of communities, as evidenced by perplexity scores. Moreover, an application to the COMBO dataset highlights its effectiveness in detecting biologically significant and coherent communities and enterotypes, revealing a broader range of correlations between community abundances. These results underscore the new model as a definite improvement over LDA.
Citations: 0