{"title":"Linear covariance selection model via ℓ1-penalization","authors":"Kwan-Young Bak , Seongoh Park","doi":"10.1016/j.csda.2025.108176","DOIUrl":"10.1016/j.csda.2025.108176","url":null,"abstract":"<div><div>This paper presents a study on an <span><math><msub><mrow><mi>ℓ</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span>-penalized covariance regression method. Conventional approaches in high-dimensional covariance estimation often lack the flexibility to integrate external information. As a remedy, we adopt the regression-based covariance modeling framework and introduce a linear covariance selection model (LCSM) to encompass a broader spectrum of covariance structures when covariate information is available. Unlike existing methods, we do not assume that the true covariance matrix can be exactly represented by a linear combination of known basis matrices. Instead, we adopt additional basis matrices for a portion of the covariance patterns not captured by the given bases. To estimate high-dimensional regression coefficients, we exploit the sparsity-inducing <span><math><msub><mrow><mi>ℓ</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span>-penalization scheme. Our theoretical analyses are based on the (symmetric) matrix regression model with additive random error matrix, which allows us to establish new non-asymptotic convergence rates of the proposed covariance estimator. The proposed method is implemented with the coordinate descent algorithm. We conduct empirical evaluation on simulated data to complement theoretical findings and underscore the efficacy of our approach. To show a practical applicability of our method, we further apply it to the co-expression analysis of liver gene expression data where the given basis corresponds to the adjacency matrix of the co-expression network.</div></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"209 ","pages":"Article 108176"},"PeriodicalIF":1.5,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143725905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A deflation-adjusted Bayesian information criterion for selecting the number of clusters in K-means clustering","authors":"Masao Ueki","doi":"10.1016/j.csda.2025.108170","DOIUrl":"10.1016/j.csda.2025.108170","url":null,"abstract":"<div><div>A deflation-adjusted Bayesian information criterion is proposed by introducing a closed-form adjustment to the variance estimate after K-means clustering. An expected lower bound of the deflation in the variance estimate after K-means clustering is derived and used as an adjustment factor for the variance estimates. The deflation-adjusted variance estimates are applied to the Bayesian information criterion under the Gaussian model for selecting the number of clusters. The closed-form expression makes the proposed deflation-adjusted Bayesian information criterion computationally efficient. Simulation studies show that the deflation-adjusted Bayesian information criterion performs better than other existing clustering methods in some situations, including K-means clustering with the number of clusters selected by standard Bayesian information criteria, the gap statistic, the average silhouette score, the prediction strength, and clustering using a Gaussian mixture model with the Bayesian information criterion. The proposed method is illustrated through a real data application for clustering human genomic data from the 1000 Genomes Project.</div></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"209 ","pages":"Article 108170"},"PeriodicalIF":1.5,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143738691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multiple imputation approach for flexible modelling of interval-censored data with missing and censored covariates","authors":"Yichen Lou , Yuqing Ma , Liming Xiang , Jianguo Sun","doi":"10.1016/j.csda.2025.108177","DOIUrl":"10.1016/j.csda.2025.108177","url":null,"abstract":"<div><div>This paper discusses regression analysis of interval-censored failure time data that commonly occur in biomedical studies among others. For the situation, the failure event of interest is known only to occur within an interval instead of being observed exactly. In addition to interval censoring on the failure time of interest, sometimes covariates may be missing or suffer censoring, which can bring extra theoretical and computational challenges for the regression analysis. To deal with such data, we propose a novel multiple imputation approach with the use of the rejection sampling under a class of semiparametric transformation models. The proposed method is flexible and can lead to more efficient estimation than the existing methods, and the resulting estimators are shown to be consistent and asymptotically normal. An extensive simulation study is conducted and demonstrates that the proposed approach works well in practice. Finally, we apply the proposed approach to a set of real data on Alzheimer's disease that motivated this study.</div></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"209 ","pages":"Article 108177"},"PeriodicalIF":1.5,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143714600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse factor analysis for categorical data with the group-sparse generalized singular value decomposition","authors":"Ju-Chi Yu , Julie Le Borgne , Anjali Krishnan , Arnaud Gloaguen , Cheng-Ta Yang , Laura A. Rabin , Hervé Abdi , Vincent Guillemot","doi":"10.1016/j.csda.2025.108179","DOIUrl":"10.1016/j.csda.2025.108179","url":null,"abstract":"<div><div>Correspondence analysis, multiple correspondence analysis, and their discriminant counterparts (i.e., discriminant simple correspondence analysis and discriminant multiple correspondence analysis) are methods of choice for analyzing multivariate categorical data. In these methods, variables are integrated into optimal components computed as linear combinations whose weights are obtained from a generalized singular value decomposition (GSVD) that integrates specific metric constraints on the rows and columns of the original data matrix. The weights of the linear combinations are, in turn, used to interpret the components, and this interpretation is facilitated when components are 1) pairwise orthogonal and 2) when the values of the weights are either large or small but not intermediate—a configuration called a simple or a sparse structure. To obtain such simple configurations, the optimization problem solved by the GSVD is extended to include new constraints that implement component orthogonality and sparse weights. Because multiple correspondence analysis represents qualitative variables by a set of binary columns in the data matrix, an additional group constraint is added to the optimization problem in order to sparsify the whole set of columns representing one qualitative variable. This method—called group-sparse GSVD (gsGSVD)—integrates these constraints in a new algorithm via an iterative projection scheme onto the intersection of subspaces where each subspace implements a specific constraint. This algorithm is described in details, and we show how it can be adapted to the sparsification of simple and multiple correspondence analysis (as well as their barycentric discriminant analysis versions). This algorithm is illustrated with the analysis of four different data sets—each illustrating the sparsification of a particular CA-based method.</div></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"209 ","pages":"Article 108179"},"PeriodicalIF":1.5,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143769158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selecting time-series hyperparameters with the artificial jackknife","authors":"Filippo Pellegrino","doi":"10.1016/j.csda.2025.108173","DOIUrl":"10.1016/j.csda.2025.108173","url":null,"abstract":"<div><div>A generalisation of the delete-<em>d</em> jackknife is proposed for solving hyperparameter selection problems in time series. The method is referred to as the artificial delete-<em>d</em> jackknife, emphasizing that it replaces the classic removal step with a fictitious deletion, wherein observed data points are replaced with artificial missing values. This procedure preserves the data order, ensuring seamless compatibility with time series. The approach is asymptotically justified and its finite-sample properties are studied via simulations. In addition, an application based on foreign exchange rates illustrates its practical relevance.</div></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"209 ","pages":"Article 108173"},"PeriodicalIF":1.5,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143680032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hidden semi-Markov models with inhomogeneous state dwell-time distributions","authors":"Jan-Ole Koslik","doi":"10.1016/j.csda.2025.108171","DOIUrl":"10.1016/j.csda.2025.108171","url":null,"abstract":"<div><div>The well-established methodology for the estimation of hidden semi-Markov models (HSMMs) as hidden Markov models (HMMs) with extended state spaces is further developed. Covariate influences are incorporated across all aspects of the state process model, in particular regarding the distributions governing the state dwell time. The special case of periodically varying covariate effects on the state dwell-time distributions — and possibly the conditional transition probabilities — is examined in detail. Important properties of these models are derived, including the periodically varying unconditional state distribution as well as the overall state dwell-time distribution. Simulation studies are conducted to assess key properties of these models and provide recommendations for hyperparameter settings. A case study involving an HSMM with periodically varying dwell-time distributions is presented to analyse the movement trajectory of an Arctic muskox, demonstrating the practical relevance of the developed methodology.</div></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"209 ","pages":"Article 108171"},"PeriodicalIF":1.5,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143644147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model-based edge clustering for weighted networks with a noise component","authors":"Haomin Li, Daniel K. Sewell","doi":"10.1016/j.csda.2025.108172","DOIUrl":"10.1016/j.csda.2025.108172","url":null,"abstract":"<div><div>Clustering is a fundamental task in network analysis, essential for uncovering hidden structures within complex systems. Edge clustering, which focuses on relationships between nodes rather than the nodes themselves, has gained increased attention in recent years. However, existing edge clustering algorithms often overlook the significance of edge weights, which can represent the strength or capacity of connections, and fail to account for noisy edges—connections that obscure the true structure of the network. To address these challenges, the Weighted Edge Clustering Adjusting for Noise (WECAN) model is introduced. This novel algorithm integrates edge weights into the clustering process and includes a noise component that filters out spurious edges. WECAN offers a data-driven approach to distinguishing between meaningful and noisy edges, avoiding the arbitrary thresholding commonly used in network analysis. Its effectiveness is demonstrated through simulation studies and applications to real-world datasets, showing significant improvements over traditional clustering methods. Additionally, the R package “WECAN”<span><span><sup>1</sup></span></span> has been developed to facilitate its practical implementation.</div></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"209 ","pages":"Article 108172"},"PeriodicalIF":1.5,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143644209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Functional nonlinear principal component analysis","authors":"Qingzhi Zhong , Xinyuan Song","doi":"10.1016/j.csda.2025.108169","DOIUrl":"10.1016/j.csda.2025.108169","url":null,"abstract":"<div><div>The widely adopted dimension reduction technique, functional principal component analysis (FPCA), typically represents functional data as a linear combination of functional principal components (FPCs) and their corresponding scores. However, this linear formulation is too restrictive to reflect reality because it fails to capture the nonlinear dependence of functional data when nonlinear features are present in the data. This study develops a novel FPCA model to uncover the nonlinear structures of functional data. The proposed method can accommodate multivariate functional data observed on different domains, and multidimensional functional data with gaps and holes. To navigate the complexities of spatial structure in multidimensional functional variables, tensor product smoothing and spline smoothing over triangulation are employed, providing precise tools for approximating nonparametric function. Furthermore, an efficient estimation approach and theory are developed when the number of FPCs diverges to infinity. To assess its performance comprehensively, extensive simulations are conducted, and the proposed method is applied to real data from the Alzheimer's Disease Neuroimaging Initiative study, affirming its practical efficacy in uncovering and interpreting nonlinear structures inherent in functional data.</div></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"209 ","pages":"Article 108169"},"PeriodicalIF":1.5,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143644148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Manifold-valued models for analysis of EEG time series data","authors":"Tao Ding , Tom M.W. Nye , Yujiang Wang","doi":"10.1016/j.csda.2025.108168","DOIUrl":"10.1016/j.csda.2025.108168","url":null,"abstract":"<div><div>EEG (electroencephalogram) records brain electrical activity and is a vital clinical tool in the diagnosis and treatment of epilepsy. Time series of covariance matrices between EEG channels for patients suffering from epilepsy, obtained from an open-source dataset, are analysed. The aim is two-fold: to develop a model with interpretable parameters for different possible modes of EEG dynamics, and to explore the extent to which modelling results are affected by the choice of geometry imposed on the space of covariance matrices. The space of full-rank covariance matrices of fixed dimension forms a smooth manifold, and any statistical analysis inherently depends on the choice of metric or Riemannian structure on this manifold. The model specifies a distribution for the tangent direction vector at any time point, combining an autoregressive term, a mean reverting term and a form of Gaussian noise. Parameter inference is performed by maximum likelihood estimation, and we compare modelling results obtained using the standard Euclidean geometry and the affine invariant geometry on covariance matrices. The findings reveal distinct dynamics between epileptic seizures and interictal periods (between seizures), with interictal series characterized by strong mean reversion and absence of autoregression, while seizures exhibit significant autoregressive components with weaker mean reversion. The fitted models are also used to measure seizure dissimilarity within and between patients.</div></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"209 ","pages":"Article 108168"},"PeriodicalIF":1.5,"publicationDate":"2025-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143680033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Regression analysis of elliptically symmetric directional data","authors":"Zehao Yu, Xianzheng Huang","doi":"10.1016/j.csda.2025.108167","DOIUrl":"10.1016/j.csda.2025.108167","url":null,"abstract":"<div><div>A comprehensive toolkit is developed for regression analysis of directional data based on a flexible class of angular Gaussian distributions. Informative testing procedures to assess rotational symmetry around the mean direction, and the dependence of model parameters on covariates are proposed. Bootstrap-based algorithms are provided to assess the significance of the proposed test statistics. Moreover, a prediction region that achieves the smallest volume in a class of ellipsoidal prediction regions of the same coverage probability is constructed. The efficacy of these inference procedures is demonstrated in simulation experiments. Finally, this new toolkit is used to analyze directional data originating from a hydrology study and a bioinformatics application.</div></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"208 ","pages":"Article 108167"},"PeriodicalIF":1.5,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143534290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}