Computational Statistics & Data Analysis: Latest Articles

Beta-CoRM: A Bayesian approach for n-gram profiles analysis
IF 1.5, Zone 3 (Mathematics)
Computational Statistics & Data Analysis, vol. 202, Article 108056. Pub Date: 2024-09-10. DOI: 10.1016/j.csda.2024.108056
José A. Perusquía, Jim E. Griffin, Cristiano Villa
Abstract: n-gram profiles have been successfully and widely used to analyse long sequences of potentially differing lengths for clustering or classification. Mainly, machine learning algorithms have been used for this purpose but, despite their predictive performance, these methods cannot discover hidden structures or provide a full probabilistic representation of the data. A novel class of Bayesian generative models for n-gram profiles used as binary attributes is designed to address this. The flexibility of the proposed modelling allows a straightforward approach to feature selection in the generative model. Furthermore, a slice sampling algorithm is derived for a fast inferential procedure, which is applied to synthetic and real data scenarios and shows that feature selection can improve classification accuracy.
PDF: https://www.sciencedirect.com/science/article/pii/S0167947324001403/pdfft?md5=9000ddccd99ed2327e978f13456b5381&pid=1-s2.0-S0167947324001403-main.pdf
Citations: 0
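As a reading aid for the Beta-CoRM entry above, the following is a minimal sketch of what "n-gram profiles used as binary attributes" can look like in practice: each sequence is mapped to a 0/1 vector recording which n-grams occur in it. The function names `ngrams` and `binary_profiles` and the toy sequences are illustrative assumptions, not taken from the paper, and the paper's generative model and slice sampler are not reproduced here.

```python
def ngrams(seq, n):
    """Return the set of contiguous n-grams (as tuples) found in seq."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def binary_profiles(sequences, n):
    """Map each sequence to a 0/1 vector over all n-grams seen in any sequence.

    Hypothetical helper for illustration only: sequences of different
    lengths end up as vectors of the same length (one entry per n-gram).
    """
    per_seq = [ngrams(s, n) for s in sequences]
    vocab = sorted(set().union(*per_seq))
    rows = [[1 if g in grams else 0 for g in vocab] for grams in per_seq]
    return vocab, rows


# Two toy sequences of different lengths, profiled with bigrams (n = 2).
vocab, profiles = binary_profiles(["abab", "abcabc"], n=2)
```

Here `vocab` lists the distinct bigrams in sorted order and `profiles` holds one binary row per input sequence, which is the kind of input a model over binary attributes would consume.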
Minimum profile Hellinger distance estimation of general covariate models
IF 1.5, Zone 3 (Mathematics)
Computational Statistics & Data Analysis, vol. 202, Article 108054. Pub Date: 2024-08-30. DOI: 10.1016/j.csda.2024.108054
Bowei Ding, Rohana J. Karunamuni, Jingjing Wu
Abstract: Covariate models, such as polynomial regression models, generalized linear models, and heteroscedastic models, are widely used in statistical applications. The importance of such models in statistical analysis is abundantly clear from the ever-increasing rate at which articles on covariate models are appearing in the statistical literature. Because of their flexibility, covariate models are increasingly being exploited as a convenient way to model data that consist of both a response variable and one or more covariate variables that affect the outcome of the response variable. Efficient and robust estimates for broadly defined semiparametric covariate models are investigated, and for this purpose the minimum distance approach is employed. In general, minimum distance estimators are automatically robust with respect to the stability of the quantity being estimated. In particular, minimum Hellinger distance estimation for parametric models produces estimators that are asymptotically efficient at the model density and simultaneously possess excellent robustness properties. For semiparametric covariate models, the minimum Hellinger distance method is extended and a minimum profile Hellinger distance estimator is proposed. Its asymptotic properties such as consistency are studied, and its finite-sample performance and robustness are examined by using Monte Carlo simulations and three real data analyses. Additionally, a computing algorithm is developed to ease the computation of the estimator.
PDF: https://www.sciencedirect.com/science/article/pii/S0167947324001385/pdfft?md5=cefa2d178122667194291a858ff4b934&pid=1-s2.0-S0167947324001385-main.pdf
Citations: 0
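For orientation on the entry above, the Hellinger distance that minimum Hellinger distance estimators minimise can be sketched for the simplest case of two discrete distributions on a common support. This is only the textbook distance under one common normalisation (the factor 1/2 inside the square root, which bounds the distance by 1); the function name `hellinger` is an assumption, and the paper's profile estimator for semiparametric covariate models is not implemented here.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions p and q.

    Assumes p and q are probability vectors on the same support.
    Uses the normalisation H = sqrt(0.5 * sum((sqrt(p_i) - sqrt(q_i))^2)),
    so that 0 <= H <= 1.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2.0)


# Identical distributions are at distance 0; disjoint supports give distance 1.
d_same = hellinger([0.5, 0.5], [0.5, 0.5])
d_disjoint = hellinger([1.0, 0.0], [0.0, 1.0])
```

A minimum Hellinger distance estimator would choose the model parameter whose density is closest, in this distance, to a nonparametric density estimate of the data.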
Robust direction estimation in single-index models via cumulative divergence
IF 1.5, Zone 3 (Mathematics)
Computational Statistics & Data Analysis, vol. 202, Article 108052. Pub Date: 2024-08-30. DOI: 10.1016/j.csda.2024.108052
Shuaida He, Jiarui Zhang, Xin Chen
Abstract: In this paper, we address direction estimation in single-index models, with a focus on heavy-tailed data applications. Our method utilizes cumulative divergence to directly capture the conditional mean dependence between the response variable and the index predictor, resulting in a model-free property that obviates the need for initial link function estimation. Furthermore, our approach allows heavy-tailed predictors and is robust against the presence of outliers, leveraging the rank-based nature of cumulative divergence. We establish theoretical properties for our proposal under mild regularity conditions and illustrate its solid performance through comprehensive simulations and real data analysis.
Citations: 0
A Bayesian cluster validity index
IF 1.5, Zone 3 (Mathematics)
Computational Statistics & Data Analysis, vol. 202, Article 108053. Pub Date: 2024-08-30. DOI: 10.1016/j.csda.2024.108053
Onthada Preedasawakul, Nathakhun Wiroonsri
Abstract: Selecting the appropriate number of clusters is a critical step in applying clustering algorithms. To assist in this process, various cluster validity indices (CVIs) have been developed. These indices are designed to identify the optimal number of clusters within a dataset. However, users may not always seek the absolute optimal number of clusters but rather a secondary option that better aligns with their specific applications. This realization has led us to introduce a Bayesian cluster validity index (BCVI), which builds upon existing indices. The BCVI utilizes either Dirichlet or generalized Dirichlet priors, resulting in the same posterior distribution. The proposed BCVI is evaluated using the Calinski-Harabasz, CVNN, Davies–Bouldin, silhouette, Starczewski, and Wiroonsri indices for hard clustering and the KWON2, Wiroonsri–Preedasawakul, and Xie–Beni indices for soft clustering as underlying indices. The performance of the proposed BCVI is compared with that of the original underlying indices. The BCVI offers clear advantages in situations where user expertise is valuable, allowing users to specify their desired range for the final number of clusters. To illustrate this, experiments classified into three different scenarios are conducted. Additionally, the practical applicability of the proposed approach is demonstrated on real-world datasets, such as MRI brain tumor images. These tools are published as a recent R package ‘BayesCVI’.
Citations: 0
On the use of the cumulant generating function for inference on time series
IF 1.5, Zone 3 (Mathematics)
Computational Statistics & Data Analysis, vol. 201, Article 108044. Pub Date: 2024-08-28. DOI: 10.1016/j.csda.2024.108044
A. Moor, D. La Vecchia, E. Ronchetti
Abstract: Innovative inference procedures for analyzing time series data are introduced. The methodology covers density approximation and composite hypothesis testing based on Whittle's estimator, which is a widely applied M-estimator in the frequency domain. Its core feature involves the cumulant generating function of Whittle's score obtained using an approximated distribution of the periodogram ordinates. A testing algorithm not only significantly expands the applicability of the state-of-the-art saddlepoint test, but also maintains the numerical accuracy of the saddlepoint approximation. Connections are made with three other prevalent frequency domain techniques: the bootstrap, empirical likelihood, and exponential tilting. Numerical examples using both simulated and real data illustrate the advantages and accuracy of the saddlepoint methods.
PDF: https://www.sciencedirect.com/science/article/pii/S0167947324001282/pdfft?md5=9b20083653468ba252743f2a96727926&pid=1-s2.0-S0167947324001282-main.pdf
Citations: 0
Minimax rates of convergence for sliced inverse regression with differential privacy
IF 1.5, Zone 3 (Mathematics)
Computational Statistics & Data Analysis, vol. 201, Article 108041. Pub Date: 2024-08-22. DOI: 10.1016/j.csda.2024.108041
Wenbiao Zhao, Xuehu Zhu, Lixing Zhu
Abstract: Sliced inverse regression (SIR) is a highly efficient paradigm used for the purpose of dimension reduction by replacing high-dimensional covariates with a limited number of linear combinations. This paper focuses on the implementation of the classical SIR approach integrated with a Gaussian differential privacy mechanism to estimate the central space while preserving privacy. We illustrate the tradeoff between statistical accuracy and privacy in sufficient dimension reduction problems under both the classical low-dimensional and modern high-dimensional settings. Additionally, we achieve the minimax rate of the proposed estimator with Gaussian differential privacy constraint and illustrate that this rate is also optimal for multiple index models with bounded dimension of the central space. Extensive numerical studies on synthetic data sets are conducted to assess the effectiveness of the proposed technique in finite sample scenarios, and a real data analysis is presented to showcase its practical application.
PDF: https://www.sciencedirect.com/science/article/pii/S0167947324001257/pdfft?md5=cab1d33929cc2c1071e939e0580ca683&pid=1-s2.0-S0167947324001257-main.pdf
Citations: 0
Test for the mean of high-dimensional functional time series
IF 1.5, Zone 3 (Mathematics)
Computational Statistics & Data Analysis, vol. 201, Article 108040. Pub Date: 2024-08-22. DOI: 10.1016/j.csda.2024.108040
Lin Yang, Zhenghui Feng, Qing Jiang
Abstract: The one-sample test and two-sample test for the mean of high-dimensional functional time series are considered in this study. The proposed tests are built on the dimension-wise max-norm of the sum of squares of diverging projections. The null distribution of the test statistics is investigated using normal approximation, and the asymptotic behavior under the alternative is studied. The approach is robust to the cross-series dependence of unknown forms and magnitude. To approximate the critical values, a blockwise wild bootstrap method for functional time series is employed. Both fully and partially observed data are analyzed in theoretical research and numerical studies. Evidence from simulation studies and an IT stock data case study demonstrates the usefulness of the test in practice. The proposed methods have been implemented in an R package.
PDF: https://www.sciencedirect.com/science/article/pii/S0167947324001245/pdfft?md5=a3ba37187b9ba57e45af87f61b64c9c8&pid=1-s2.0-S0167947324001245-main.pdf
Citations: 0
Community influence analysis in social networks
IF 1.5, Zone 3 (Mathematics)
Computational Statistics & Data Analysis, vol. 202, Article 108037. Pub Date: 2024-08-22. DOI: 10.1016/j.csda.2024.108037
Yuanxing Chen, Kuangnan Fang, Wei Lan, Chih-Ling Tsai, Qingzhao Zhang
Abstract: Heterogeneous influence detection across network nodes is an important task in network analysis. A community influence model (CIM) is proposed to allow nodes to be classified into different communities (i.e., clusters or groups) such that the nodes within the same community share the common influence parameter. Employing the quasi-maximum likelihood approach, together with the fused lasso-type penalty, both the number of communities and the influence parameters can be estimated without imposing any specific distribution assumption on the error terms. The resulting estimators are shown to enjoy the oracle property; namely, they perform as well as if the true underlying network structure were known in advance. The proposed approach is also applicable for identifying influential nodes in a homogeneous setting. The performance of our method is illustrated via simulation studies and two empirical examples using stock data and coauthor citation data, respectively.
Citations: 0
Feasible model-based principal component analysis: Joint estimation of rank and error covariance matrix
IF 1.5, Zone 3 (Mathematics)
Computational Statistics & Data Analysis, vol. 201, Article 108042. Pub Date: 2024-08-22. DOI: 10.1016/j.csda.2024.108042
Tak-Shing T. Chan, Alex Gibberd
Abstract: Real-world inputs to principal component analysis are often corrupted by temporally or spatially correlated errors. There are several methods to mitigate this, e.g., generalized least-square matrix decomposition and maximum likelihood approaches; however, they all require the number of components or the error covariances to be known in advance, rendering the methods infeasible. To address this issue, a novel method is developed which estimates the number of components and the error covariances at the same time. The method is based on working covariance models, an idea adapted from generalized estimating equations, where the user only specifies the structural form of the error covariances. If the structural form is also unknown, working covariance selection can be used to search for the best structure from a user-defined library. Experiments on synthetic and real data confirm the efficacy of the proposed approach.
PDF: https://www.sciencedirect.com/science/article/pii/S0167947324001269/pdfft?md5=ac444320856de4406b797dc038c23d54&pid=1-s2.0-S0167947324001269-main.pdf
Citations: 0
Hierarchical Bayesian spectral regression with shape constraints for multi-group data
IF 1.5, Zone 3 (Mathematics)
Computational Statistics & Data Analysis, vol. 200, Article 108036. Pub Date: 2024-08-08. DOI: 10.1016/j.csda.2024.108036
Peter Lenk, Jangwon Lee, Dongu Han, Jichan Park, Taeryon Choi
Abstract: We propose a hierarchical Bayesian (HB) model for multi-group analysis with group-specific, flexible regression functions. The lower-level (within group) and upper-level (between groups) regression functions have hierarchical Gaussian process priors. HB smoothing priors are developed for the spectral coefficients. The HB priors smooth the estimated functions within and between groups. The HB model is particularly useful when data within groups are sparse because it shares information across groups, and provides more accurate estimates than fitting separate nonparametric models to each group. The proposed model also allows shape constraints, such as monotone, U and S-shaped, and multi-modal constraints. When appropriate, shape constraints improve estimation by recognizing violations of the shape constraints as noise. The model is illustrated by two examples: monotone growth curves for children, and happiness as a convex, U-shaped function of age in multiple countries. Various basis functions could also be used, and the paper also implements versions with B-splines and orthogonal polynomials.
Citations: 0