Computational Statistics & Data Analysis: Latest Articles, Page 3

Statistical inference for partially shape-constrained function-on-scalar linear regression models

IF 1.5 | Zone 3, Mathematics

Computational Statistics & Data Analysis Pub Date : 2025-05-12 DOI: 10.1016/j.csda.2025.108200

Kyunghee Han, Yeonjoo Park, Soo-Young Kim

Abstract: Functional linear regression models are widely used to link functional/longitudinal outcomes with multiple scalar predictors, identifying time-varying covariate effects through regression coefficient functions. Beyond assessing statistical significance, characterizing the shapes of coefficient functions is crucial for drawing interpretable scientific conclusions. Existing studies on shape-constrained analysis primarily focus on global shapes, which require strict prior knowledge of functional relationships across the entire domain. This often leads to misspecified regression models due to a lack of prior information, making them impractical for real-world applications. To address this, a flexible framework is introduced to identify partial shapes in regression coefficient functions. The proposed partial shape-constrained analysis enables researchers to validate functional shapes within a targeted sub-domain, avoiding the misspecification of shape constraints outside the sub-domain of interest. The method also allows for testing different sub-domains for individual covariates and multiple partial shape constraints across composite sub-domains. Our framework supports both kernel- and spline-based estimation approaches, ensuring robust performance with flexibility in computational preference. Finite-sample experiments across various scenarios demonstrate that the proposed framework significantly outperforms the application of global shape constraints to partial domains in both estimation and inference procedures. The inferential tool particularly maintains the type I error rate at the nominal significance level and exhibits increasing power with larger sample sizes, confirming the consistency of the test procedure. The practicality of partial shape-constrained inference is demonstrated through two applications: a clinical trial on NeuroBloc for type A-resistant cervical dystonia and the National Institute of Mental Health Schizophrenia Study.

Volume 211, Article 108200

Citations: 0

Distributed variable screening for generalized linear models

IF 1.5 | Zone 3, Mathematics

Computational Statistics & Data Analysis Pub Date : 2025-05-12 DOI: 10.1016/j.csda.2025.108203

Tianbo Diao, Bo Li, Lianqiang Qu, Liuquan Sun

Citations: 0

Quantile Super Learning for independent and online settings with application to solar power forecasting

IF 1.5 | Zone 3, Mathematics

Computational Statistics & Data Analysis Pub Date : 2025-05-09 DOI: 10.1016/j.csda.2025.108202

Herbert Susmann, Antoine Chambaz

Citations: 0

Monotone composite quantile regression neural network for censored data with a cure fraction

IF 1.5 | Zone 3, Mathematics

Computational Statistics & Data Analysis Pub Date : 2025-05-08 DOI: 10.1016/j.csda.2025.108201

Xinran Zhang, Xiaohui Yuan, Chunjie Wang, Xinyuan Song

Citations: 0

Latent-class trajectory modeling with a heterogeneous mean-variance relation

IF 1.5 | Zone 3, Mathematics

Computational Statistics & Data Analysis Pub Date : 2025-05-02 DOI: 10.1016/j.csda.2025.108199

Niek G.P. Den Teuling, Francesco Ungolo, Steffen C. Pauws, Edwin R. van den Heuvel

Abstract: The benefit of addressing heteroskedastic residual variances across trajectories is investigated with the purpose of finding clusters of longitudinal trajectories. Models are proposed to account for class-specific heteroskedasticity through a mean-variance relation or random residual variance, thereby accounting for trajectory-specific variance. The analyzed latent-class trajectory models are an extension of growth mixture models (GMM). The estimation bias of the model parameters and the recoverability of the number of latent classes are assessed under various data-generating models and settings by means of a simulation study. Furthermore, the empirical applicability of these models is demonstrated through the analysis of the time-varying incidence rate of COVID-19 cases across counties in the United States. Overall, the class-specific mean-variance could be reliably estimated by the proposed models in datasets comprising 250 trajectories. In addition, the extended GMM accounting for the residual random variance showed improved group trajectory estimation over the standard GMM.

Volume 210, Article 108199

Citations: 0

A goodness-of-fit test for geometric Brownian motion

IF 1.5 | Zone 3, Mathematics

Computational Statistics & Data Analysis Pub Date : 2025-04-23 DOI: 10.1016/j.csda.2025.108196

Daniel Gaigall, Philipp Wübbolding

Citations: 0

A simultaneous confidence-bounded true discovery proportion perspective on localizing differences in smooth terms in regression models

IF 1.5 | Zone 3, Mathematics

Computational Statistics & Data Analysis Pub Date : 2025-04-23 DOI: 10.1016/j.csda.2025.108197

David Swanson

Abstract: A method is demonstrated for localizing where two spline terms, or smooths, differ using a true discovery proportion (TDP)-based interpretation. The procedure yields a statement on the proportion of some region where true differences exist between two smooths. The methodology avoids ad hoc approaches to making such statements, like subsetting the data and performing hypothesis tests on the truncated spline terms. TDP estimates are 1 − α confidence-bounded simultaneously, which means that a region's TDP estimate is a lower bound on the proportion of actual differences, or true discoveries, in that region, with high confidence regardless of the number of estimates made. The procedure is based on closed-testing using Simes local test. This local test requires that the multivariate χ² test statistics of generalized Wishart type underlying the method be positive regression dependent on subsets (PRDS), a result for which evidence is presented suggesting that the condition holds. Consistency of the procedure is demonstrated for generalized additive models with the tuning parameter chosen by REML or GCV, and the achievement of confidence-bounded TDP is shown in simulation as is an analysis of walking gait.

Volume 211, Article 108197

Citations: 0
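
The Swanson abstract above builds on closed testing with the Simes local test. As a rough illustration of that one ingredient only, the sketch below computes the standard Simes combined p-value for a set of p-values: sort them, then take the minimum of m * p_(i) / i over ranks i. The function names and the default alpha are assumptions made for this illustration; the sketch does not reproduce the paper's closed-testing procedure or its TDP bounds.

```python
from typing import Sequence


def simes_p_value(p_values: Sequence[float]) -> float:
    """Simes combined p-value for the intersection of m null hypotheses."""
    m = len(p_values)
    if m == 0:
        raise ValueError("need at least one p-value")
    ordered = sorted(p_values)
    # Minimum over ranks i = 1..m of m * p_(i) / i.
    return min(m * p / rank for rank, p in enumerate(ordered, start=1))


def simes_rejects(p_values: Sequence[float], alpha: float = 0.05) -> bool:
    """Reject the intersection hypothesis when the Simes p-value is <= alpha."""
    return simes_p_value(p_values) <= alpha
```

For example, `simes_p_value([0.01, 0.04, 0.03])` evaluates the three candidates 0.03, 0.045, and 0.04 and returns the smallest of them. The validity of the test depends on a dependence condition such as the PRDS property discussed in the abstract.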

Co-clustering multi-view data using the Latent Block Model

IF 1.5 | Zone 3, Mathematics

Computational Statistics & Data Analysis Pub Date : 2025-04-10 DOI: 10.1016/j.csda.2025.108188

Joshua Tobin, Michaela Black, James Ng, Debbie Rankin, Jonathan Wallace, Catherine Hughes, Leane Hoey, Adrian Moore, Jinling Wang, Geraldine Horigan, Paul Carlin, Helene McNulty, Anne M. Molloy, Mimi Zhang

Abstract: The Latent Block Model (LBM) is a prominent model-based co-clustering method, returning parametric representations of each block-cluster and allowing the use of well-grounded model selection methods. Although the LBM has been adapted to accommodate various feature types, it cannot be applied to datasets consisting of multiple distinct sets of features, termed views, for a common set of observations. The multi-view LBM is introduced herein, extending the LBM method to multi-view data, where each view marginally follows an LBM. For any pair of two views, the dependence between them is captured by a row-cluster membership matrix. A likelihood-based approach is formulated for parameter estimation, harnessing a stochastic EM algorithm merged with a Gibbs sampler, while an ICL criterion is formulated to determine the number of row- and column-clusters in each view. To justify the application of the multi-view approach, hypothesis tests are formulated to evaluate the independence of row-clusters across views, with the testing procedure seamlessly integrated into the estimation framework. A penalty scheme is also introduced to induce sparsity in row-clusterings. The algorithm's performance is validated using synthetic and real-world datasets, accompanied by recommendations for optimal parameter selection. Finally, the multi-view co-clustering method is applied to a complex genomics dataset, and is shown to provide new insights for high-dimension multi-view problems.

Volume 210, Article 108188

Citations: 0

Non-parametric tests for cross-dependence based on multivariate extensions of ordinal patterns

IF 1.5 | Zone 3, Mathematics

Computational Statistics & Data Analysis Pub Date : 2025-04-10 DOI: 10.1016/j.csda.2025.108189

Angelika Silbernagel, Christian H. Weiß, Alexander Schnurr

Citations: 0

A flexible mixed-membership model for community and enterotype detection for microbiome data

IF 1.5 | Zone 3, Mathematics

Computational Statistics & Data Analysis Pub Date : 2025-04-04 DOI: 10.1016/j.csda.2025.108181

Alice Giampino, Roberto Ascari, Sonia Migliorati

Abstract: Understanding how the human gut microbiome affects host health is challenging due to the wide interindividual variability, sparsity, and high dimensionality of microbiome data. Mixed-membership models have been previously applied to these data to detect latent communities of bacterial taxa that are expected to co-occur. The most widely used mixed-membership model is latent Dirichlet allocation (LDA). However, LDA is limited by the rigidity of the Dirichlet distribution imposed on the community proportions, which hinders its ability to model dependencies and account for overdispersion. To address this limitation, a generalization of LDA is proposed that introduces greater flexibility into the covariance matrix by incorporating the flexible Dirichlet (FD), a specific identifiable mixture with Dirichlet components. In addition to identifying communities, the new model enables the detection of enterotypes, i.e., clusters of samples with similar microbe composition. For inferential purposes, a computationally efficient collapsed Gibbs sampler that exploits the conjugacy of the FD distribution with respect to the multinomial model is proposed. A simulation study demonstrates the model's ability to accurately recover true parameter values by minimizing appropriate compositional discrepancy measures between the true and estimated values. Additionally, the model correctly identifies the number of communities, as evidenced by perplexity scores. Moreover, an application to the COMBO dataset highlights its effectiveness in detecting biologically significant and coherent communities and enterotypes, revealing a broader range of correlations between community abundances. These results underscore the new model as a definite improvement over LDA.

Volume 210, Article 108181

Citations: 0
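
The Giampino, Ascari, and Migliorati abstract above points to the rigidity of the Dirichlet distribution on community proportions. One standard way to see this is through the closed-form covariance of a plain Dirichlet: every pair of distinct components has negative covariance, whatever the parameters. The sketch below evaluates that textbook formula; the function name is an assumption made for this illustration, and the code covers only the ordinary Dirichlet, not the flexible Dirichlet proposed in the paper.

```python
from typing import List, Sequence


def dirichlet_covariance(alpha: Sequence[float]) -> List[List[float]]:
    """Covariance matrix of a Dirichlet(alpha) random vector (closed form)."""
    if len(alpha) == 0 or any(a <= 0 for a in alpha):
        raise ValueError("all concentration parameters must be positive")
    a0 = sum(alpha)
    scale = a0 * a0 * (a0 + 1.0)
    k = len(alpha)
    cov = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                # Var(X_i) = alpha_i * (a0 - alpha_i) / (a0^2 * (a0 + 1))
                cov[i][j] = alpha[i] * (a0 - alpha[i]) / scale
            else:
                # Cov(X_i, X_j) = -alpha_i * alpha_j / (a0^2 * (a0 + 1)), always negative
                cov[i][j] = -alpha[i] * alpha[j] / scale
    return cov
```

Because the off-diagonal entries are fixed in sign and fully determined by the same parameters that set the means, a single Dirichlet cannot represent positive correlations between community abundances.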