Computational Statistics: Latest Articles

High dimensional controlled variable selection with model-X knockoffs in the AFT model
IF 1.3, Zone 4, Mathematics
Computational Statistics Pub Date : 2023-12-09 DOI: 10.1007/s00180-023-01426-5
Baihua He, Di Xia, Yingli Pan
Abstract: Interpretability and stability are two important characteristics required for the application of high dimensional data in statistics. Although the former has been favored by many existing forecasting methods to some extent, the latter in the sense of controlling the fraction of wrongly discovered features is still largely underdeveloped. Under the accelerated failure time model, this paper introduces a controlled variable selection method with the general framework of Model-X knockoffs to tackle high dimensional data. We provide theoretical justifications on the asymptotic false discovery rate (FDR) control. The proposed method has attracted significant interest due to its strong control of the FDR while preserving predictive power. Several simulation examples are conducted to assess the finite sample performance with desired interpretability and stability. A real data example from an Acute Myeloid Leukemia study is analyzed to demonstrate the utility of the proposed method in practice.
Citations: 0
Dimension reduction and visualization of multiple time series data: a symbolic data analysis approach
IF 1.3, Zone 4, Mathematics
Computational Statistics Pub Date : 2023-12-06 DOI: 10.1007/s00180-023-01440-7
Emily Chia-Yu Su, Han-Ming Wu
Abstract: Exploratory analysis and visualization of multiple time series data are essential for discovering the underlying dynamics of a series before attempting modeling and forecasting. This study extends two dimension reduction methods - principal component analysis (PCA) and sliced inverse regression (SIR) - to multiple time series data. This is achieved through the innovative path point approach, a new addition to the symbolic data analysis framework. By transforming multiple time series data into time-dependent intervals marked by starting and ending values, each series is geometrically represented as successive directed segments with unique path points. These path points serve as the foundation of our novel representation approach. PCA and SIR are then applied to the data table formed by the coordinates of these path points, enabling visualization of temporal trajectories of objects within a reduced-dimensional subspace. Empirical studies encompassing simulations, microarray time series data from a yeast cell cycle, and financial data confirm the effectiveness of our path point approach in revealing the structure and behavior of objects within a 2D factorial plane. Comparative analyses with existing methods, such as the applied vector approach for PCA and SIR on time-dependent interval data, further underscore the strength and versatility of our path point representation in the realm of time series data.
Citations: 0
An expectation maximization algorithm for the hidden markov models with multiparameter student-t observations
IF 1.3, Zone 4, Mathematics
Computational Statistics Pub Date : 2023-12-06 DOI: 10.1007/s00180-023-01432-7
Emna Ghorbel, Mahdi Louati
Abstract: Hidden Markov models are a class of probabilistic graphical models used to describe the evolution of a sequence of unknown variables from a set of observed variables. They are statistical models introduced by Baum and Petrie in Baum (JMA 101:789–810) and belong to the class of latent variable models. Initially developed and applied in the context of speech recognition, they have attracted much attention in many fields of application. The central objective of this research work is an extension of these models. More precisely, we define multiparameter hidden Markov models, using multiple observation processes and the Riesz distribution on the space of symmetric matrices as a natural extension of the gamma one. Some basic related properties are discussed and marginal and posterior distributions are derived. We conduct the Forward-Backward dynamic programming algorithm and the classical Expectation Maximization algorithm to estimate the global set of parameters. Using simulated data, the performance of these estimators is evaluated with a Matlab program. This allows us to assess the quality of the proposed estimators by means of the mean square errors between the true and the estimated values.
Citations: 0
Sequential linear regression for conditional mean imputation of longitudinal continuous outcomes under reference-based assumptions
IF 1.3, Zone 4, Mathematics
Computational Statistics Pub Date : 2023-12-03 DOI: 10.1007/s00180-023-01439-0
Sean Yiu
Abstract: In clinical trials of longitudinal continuous outcomes, reference-based imputation (RBI) has commonly been applied to handle missing outcome data in settings where the estimand incorporates the effects of intercurrent events, e.g. treatment discontinuation. RBI was originally developed in the multiple imputation framework; however, conditional mean imputation (CMI) combined with the jackknife estimator of the standard error was recently proposed as a way to obtain deterministic treatment effect estimates and correct frequentist inference. For both multiple imputation and CMI, a mixed model for repeated measures (MMRM) is often used for the imputation model, but this can be computationally intensive to fit to multiple data sets (e.g. the jackknife samples) and can lead to convergence issues with complex MMRM models with many parameters. Therefore, a step-wise approach based on sequential linear regression (SLR) of the outcomes at each visit was developed for the imputation model in the multiple imputation framework, but similar developments in the CMI framework are lacking. In this article, we fill this gap in the literature by proposing an SLR approach to implement RBI in the CMI framework, and justify its validity using theoretical results and simulations. We also illustrate our proposal on a real data application.
Citations: 0
Pair programming with ChatGPT for sampling and estimation of copulas
IF 1.3, Zone 4, Mathematics
Computational Statistics Pub Date : 2023-12-01 DOI: 10.1007/s00180-023-01437-2
Jan Górecki
Abstract: Without writing a single line of code by a human, an example Monte Carlo simulation-based application for stochastic dependence modeling with copulas is developed through pair programming involving a human partner and a large language model (LLM) fine-tuned for conversations. This process encompasses interacting with ChatGPT using both natural language and mathematical formalism. Under the careful supervision of a human expert, this interaction facilitated the creation of functioning code in MATLAB, Python, and R. The code performs a variety of tasks including sampling from a given copula model, evaluating the model's density, conducting maximum likelihood estimation, optimizing for parallel computing on CPUs and GPUs, and visualizing the computed results. In contrast to other emerging studies that assess the accuracy of LLMs like ChatGPT on tasks from a selected area, this work instead investigates how to achieve a successful solution of a standard statistical task in a collaboration of a human expert and artificial intelligence (AI). In particular, through careful prompt engineering, we separate successful solutions generated by ChatGPT from unsuccessful ones, resulting in a comprehensive list of related pros and cons. It is demonstrated that if the typical pitfalls are avoided, we can substantially benefit from collaborating with an AI partner. For example, we show that if ChatGPT is not able to provide a correct solution due to a lack of or incorrect knowledge, the human expert can feed it with the correct knowledge, e.g., in the form of mathematical theorems and formulas, and make it apply the gained knowledge in order to provide a correct solution. Such ability presents an attractive opportunity to achieve a programmed solution even for users with rather limited knowledge of programming techniques.
Citations: 0
Wavelet-based Bayesian approximate kernel method for high-dimensional data analysis
IF 1.3, Zone 4, Mathematics
Computational Statistics Pub Date : 2023-11-26 DOI: 10.1007/s00180-023-01438-1
Wenxing Guo, Xueying Zhang, Bei Jiang, Linglong Kong, Yaozhong Hu
Abstract: Kernel methods are often used for nonlinear regression and classification in statistics and machine learning because they are computationally cheap and accurate. The wavelet kernel functions based on wavelet analysis can efficiently approximate any nonlinear functions. In this article, we construct a novel wavelet kernel function in terms of random wavelet bases and define a linear vector space that captures nonlinear structures in reproducing kernel Hilbert spaces (RKHS). Based on the wavelet transform, the data are mapped into a low-dimensional randomized feature space and the kernel function is converted into operations of a linear machine. We then propose a new Bayesian approximate kernel model with the random wavelet expansion and use the Gibbs sampler to compute the model's parameters. Finally, some simulation studies and two real data analyses are carried out to demonstrate that the proposed method displays good stability and prediction performance compared with some other existing methods.
Citations: 0
Two-sample Behrens–Fisher problems for high-dimensional data: a normal reference F-type test
IF 1.3, Zone 4, Mathematics
Computational Statistics Pub Date : 2023-11-24 DOI: 10.1007/s00180-023-01433-6
Tianming Zhu, Pengfei Wang, Jin-Ting Zhang
Abstract: The problem of testing the equality of mean vectors for high-dimensional data has been intensively investigated in the literature. However, most of the existing tests impose strong assumptions on the underlying group covariance matrices which may not be satisfied or can hardly be checked in practice. In this article, an F-type test for two-sample Behrens–Fisher problems for high-dimensional data is proposed and studied. When the two samples are normally distributed and when the null hypothesis is valid, the proposed F-type test statistic is shown to be an F-type mixture, a ratio of two independent \(\chi^2\)-type mixtures. Under some regularity conditions and the null hypothesis, it is shown that the proposed F-type test statistic and the above F-type mixture have the same normal and non-normal limits. It is then justified to approximate the null distribution of the proposed F-type test statistic by that of the F-type mixture, resulting in the so-called normal reference F-type test. Since the F-type mixture is a ratio of two independent \(\chi^2\)-type mixtures, we employ the Welch–Satterthwaite \(\chi^2\)-approximation to the distributions of the numerator and the denominator of the F-type mixture respectively, resulting in an approximating F-distribution whose degrees of freedom can be consistently estimated from the data. The asymptotic power of the proposed F-type test is established. Two simulation studies are conducted and they show that in terms of size control, the proposed F-type test outperforms two existing competitors. The good performance of the proposed F-type test is also illustrated by a COVID-19 data example.
Citations: 0
A new bandwidth selection method for nonparametric modal regression based on generalized hyperbolic distributions
IF 1.3, Zone 4, Mathematics
Computational Statistics Pub Date : 2023-11-18 DOI: 10.1007/s00180-023-01435-4
Hongpeng Yuan, Sijia Xiang, Weixin Yao
Abstract: As a complement to standard mean and quantile regression, nonparametric modal regression has been broadly applied in various fields. By focusing on the most likely conditional value of Y given x, the nonparametric modal regression is shown to be resistant to outliers and some forms of measurement error, and the prediction intervals are shorter when data is skewed. However, the bandwidth selection is critical but very challenging, since the traditional least-squares based cross-validation method cannot be applied. We propose to select the bandwidth by applying the asymptotic global optimal bandwidth and the flexible generalized hyperbolic (GH) distribution as the distribution of the error. Unlike the plug-in method, the new method does not require preliminary parameters to be chosen in advance, is easy to compute with any statistical software, and is computationally efficient compared to the existing kernel density estimator (KDE) based method. Numerical studies show that the GH based bandwidth performs better than existing bandwidth selectors in terms of higher coverage probabilities. Real data applications also illustrate the superior performance of the new bandwidth.
Citations: 0
Simultaneous subgroup identification and variable selection for high dimensional data
IF 1.3, Zone 4, Mathematics
Computational Statistics Pub Date : 2023-11-17 DOI: 10.1007/s00180-023-01436-3
Huicong Yu, Jiaqi Wu, Weiping Zhang
Abstract: The high dimensionality of genetic data poses many challenges for subgroup identification, both computationally and theoretically. This paper proposes a double-penalized regression model for subgroup analysis and variable selection for heterogeneous high-dimensional data. The proposed approach can automatically identify the underlying subgroups, recover the sparsity, and simultaneously estimate all regression coefficients without prior knowledge of grouping structure or sparsity construction within variables. We optimize the objective function using the alternating direction method of multipliers with a proximal gradient algorithm and demonstrate the convergence of the proposed procedure. We show that the proposed estimator enjoys the oracle property. Simulation studies demonstrate the effectiveness of the novel method with finite samples, and a real data example is provided for illustration.
Citations: 0
Nonparametric estimation of expected shortfall for α-mixing financial losses
Zone 4, Mathematics
Computational Statistics Pub Date : 2023-11-14 DOI: 10.1007/s00180-023-01434-5
Xuejun Wang, Yi Wu, Wei Wang
Citations: 0