{"title":"On the use of the cumulant generating function for inference on time series","authors":"A. Moor, D. La Vecchia, E. Ronchetti","doi":"10.1016/j.csda.2024.108044","DOIUrl":"10.1016/j.csda.2024.108044","url":null,"abstract":"<div><p>Innovative inference procedures for analyzing time series data are introduced. The methodology covers density approximation and composite hypothesis testing based on Whittle's estimator, which is a widely applied M-estimator in the frequency domain. Its core feature involves the cumulant generating function of Whittle's score obtained using an approximated distribution of the periodogram ordinates. A testing algorithm not only significantly expands the applicability of the state-of-the-art saddlepoint test, but also maintains the numerical accuracy of the saddlepoint approximation. Connections are made with three other prevalent frequency domain techniques: the bootstrap, empirical likelihood, and exponential tilting. Numerical examples using both simulated and real data illustrate the advantages and accuracy of the saddlepoint methods.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"201 ","pages":"Article 108044"},"PeriodicalIF":1.5,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167947324001282/pdfft?md5=9b20083653468ba252743f2a96727926&pid=1-s2.0-S0167947324001282-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142098072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimax rates of convergence for sliced inverse regression with differential privacy","authors":"Wenbiao Zhao , Xuehu Zhu , Lixing Zhu","doi":"10.1016/j.csda.2024.108041","DOIUrl":"10.1016/j.csda.2024.108041","url":null,"abstract":"<div><p>Sliced inverse regression (SIR) is a highly efficient paradigm used for the purpose of dimension reduction by replacing high-dimensional covariates with a limited number of linear combinations. This paper focuses on the implementation of the classical SIR approach integrated with a Gaussian differential privacy mechanism to estimate the central space while preserving privacy. We illustrate the tradeoff between statistical accuracy and privacy in sufficient dimension reduction problems under both the classical low-dimensional and modern high-dimensional settings. Additionally, we achieve the minimax rate of the proposed estimator with Gaussian differential privacy constraint and illustrate that this rate is also optimal for multiple index models with bounded dimension of the central space. Extensive numerical studies on synthetic data sets are conducted to assess the effectiveness of the proposed technique in finite sample scenarios, and a real data analysis is presented to showcase its practical application.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"201 ","pages":"Article 108041"},"PeriodicalIF":1.5,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167947324001257/pdfft?md5=cab1d33929cc2c1071e939e0580ca683&pid=1-s2.0-S0167947324001257-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142084124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Test for the mean of high-dimensional functional time series","authors":"Lin Yang , Zhenghui Feng , Qing Jiang","doi":"10.1016/j.csda.2024.108040","DOIUrl":"10.1016/j.csda.2024.108040","url":null,"abstract":"<div><p>The one-sample test and two-sample test for the mean of high-dimensional functional time series are considered in this study. The proposed tests are built on the dimension-wise max-norm of the sum of squares of diverging projections. The null distribution of the test statistics is investigated using normal approximation, and the asymptotic behavior under the alternative is studied. The approach is robust to the cross-series dependence of unknown forms and magnitude. To approximate the critical values, a blockwise wild bootstrap method for functional time series is employed. Both fully and partially observed data are analyzed in theoretical research and numerical studies. Evidence from simulation studies and an IT stock data case study demonstrates the usefulness of the test in practice. The proposed methods have been implemented in an R package.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"201 ","pages":"Article 108040"},"PeriodicalIF":1.5,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167947324001245/pdfft?md5=a3ba37187b9ba57e45af87f61b64c9c8&pid=1-s2.0-S0167947324001245-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142084125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Community influence analysis in social networks","authors":"Yuanxing Chen , Kuangnan Fang , Wei Lan , Chih-Ling Tsai , Qingzhao Zhang","doi":"10.1016/j.csda.2024.108037","DOIUrl":"10.1016/j.csda.2024.108037","url":null,"abstract":"<div><p>Heterogeneous influence detection across network nodes is an important task in network analysis. A community influence model (CIM) is proposed to allow nodes to be classified into different communities (i.e., clusters or groups) such that the nodes within the same community share the common influence parameter. Employing the quasi-maximum likelihood approach, together with the fused lasso-type penalty, both the number of communities and the influence parameters can be estimated without imposing any specific distribution assumption on the error terms. The resulting estimators are shown to enjoy the oracle property; namely, they perform as well as if the true underlying network structure were known in advance. The proposed approach is also applicable for identifying influential nodes in a homogeneous setting. The performance of our method is illustrated via simulation studies and two empirical examples using stock data and coauthor citation data, respectively.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"202 ","pages":"Article 108037"},"PeriodicalIF":1.5,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142129311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feasible model-based principal component analysis: Joint estimation of rank and error covariance matrix","authors":"Tak-Shing T. Chan, Alex Gibberd","doi":"10.1016/j.csda.2024.108042","DOIUrl":"10.1016/j.csda.2024.108042","url":null,"abstract":"<div><p>Real-world inputs to principal component analysis are often corrupted by temporally or spatially correlated errors. There are several methods to mitigate this, e.g., generalized least-square matrix decomposition and maximum likelihood approaches; however, they all require the number of components or the error covariances to be known in advance, rendering the methods infeasible. To address this issue, a novel method is developed which estimates the number of components and the error covariances at the same time. The method is based on working covariance models, an idea adapted from generalized estimating equations, where the user only specifies the structural form of the error covariances. If the structural form is also unknown, working covariance selection can be used to search for the best structure from a user-defined library. Experiments on synthetic and real data confirm the efficacy of the proposed approach.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"201 ","pages":"Article 108042"},"PeriodicalIF":1.5,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167947324001269/pdfft?md5=ac444320856de4406b797dc038c23d54&pid=1-s2.0-S0167947324001269-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142121718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Bayesian spectral regression with shape constraints for multi-group data","authors":"Peter Lenk , Jangwon Lee , Dongu Han , Jichan Park , Taeryon Choi","doi":"10.1016/j.csda.2024.108036","DOIUrl":"10.1016/j.csda.2024.108036","url":null,"abstract":"<div><p>We propose a hierarchical Bayesian (HB) model for multi-group analysis with group–specific, flexible regression functions. The lower–level (within group) and upper–level (between groups) regression functions have hierarchical Gaussian process priors. HB smoothing priors are developed for the spectral coefficients. The HB priors smooth the estimated functions within and between groups. The HB model is particularly useful when data within groups are sparse because it shares information across groups, and provides more accurate estimates than fitting separate nonparametric models to each group. The proposed model also allows shape constraints, such as monotone, U and S–shaped, and multi-modal constraints. When appropriate, shape constraints improve estimation by recognizing violations of the shape constraints as noise. The model is illustrated by two examples: monotone growth curves for children, and happiness as a convex, U-shaped function of age in multiple countries. Various basis functions could also be used, and the paper also implements versions with B-splines and orthogonal polynomials.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"200 ","pages":"Article 108036"},"PeriodicalIF":1.5,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141979432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal splitk-plot designs","authors":"Mathias Born , Peter Goos","doi":"10.1016/j.csda.2024.108028","DOIUrl":"10.1016/j.csda.2024.108028","url":null,"abstract":"<div><p>Completely randomized designs are often infeasible due to the hard-to-change nature of one or more experimental factors. In those cases, restrictions are imposed on the order of the experimental tests. The resulting experimental designs are often split-plot or split-split-plot designs in which the levels of certain hard-to-change factors are varied only a limited number of times. In agricultural machinery optimization, the number of hard-to-change factors is so large and the available time for experimentation is so short that split-plot or split-split-plot designs are infeasible as well. The only feasible kinds of designs are generalizations of split-split-plot designs, which are referred to as split<sup><em>k</em></sup>-designs, where <em>k</em> is larger than 2. The coordinate-exchange algorithm is extended to construct optimal split<sup><em>k</em></sup>-plot designs and the added value of the algorithm is demonstrated by applying it to an experiment involving a self-propelled forage harvester. The optimal design generated using the extended algorithm is substantially more efficient than the design that was actually used. Update formulas for the determinant and the inverse of the information matrix speed up the coordinate-exchange algorithm, making it feasible for large designs.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"201 ","pages":"Article 108028"},"PeriodicalIF":1.5,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167947324001129/pdfft?md5=a6856543c46f3f3fa3089527fd43efb7&pid=1-s2.0-S0167947324001129-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142075844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pinball boosting of regression quantiles","authors":"Ida Bauer , Harry Haupt , Stefan Linner","doi":"10.1016/j.csda.2024.108027","DOIUrl":"10.1016/j.csda.2024.108027","url":null,"abstract":"<div><p>An algorithm for boosting regression quantiles using asymmetric least absolute deviations, better known as pinball loss, is proposed. Existing approaches for boosting regression quantiles are essentially equal to least squares boosting of regression means with the single difference that their working residuals are based on pinball loss. All steps of our boosting algorithm are embedded in the well-established framework of quantile regression, and its main components – sequential base learning, fitting, and updating – are based on consistent scoring rules for regression quantiles. The Monte Carlo simulations performed indicate that the pinball boosting algorithm is competitive with existing approaches for boosting regression quantiles in terms of estimation accuracy and variable selection, and that its application to the study of regression quantiles of hedonic price functions allows the estimation of previously infeasible high-dimensional specifications.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"200 ","pages":"Article 108027"},"PeriodicalIF":1.5,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167947324001117/pdfft?md5=a5bb1b64a0df9825011d53531f3280e4&pid=1-s2.0-S0167947324001117-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141947257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Three-way data clustering based on the mean-mixture of matrix-variate normal distributions","authors":"Mehrdad Naderi , Mostafa Tamandi , Elham Mirfarah , Wan-Lun Wang , Tsung-I Lin","doi":"10.1016/j.csda.2024.108016","DOIUrl":"10.1016/j.csda.2024.108016","url":null,"abstract":"<div><p>With the steady growth of computer technologies, the application of statistical techniques to analyze extensive datasets has garnered substantial attention. The analysis of three-way (matrix-variate) data has emerged as a burgeoning field that has inspired statisticians in recent years to develop novel analytical methods. This paper introduces a unified finite mixture model that relies on the mean-mixture of matrix-variate normal distributions. The strength of our proposed model lies in its capability to capture and cluster a wide range of three-way data that exhibit heterogeneous, asymmetric and leptokurtic features. A computationally feasible ECME algorithm is developed to compute the maximum likelihood (ML) estimates. Numerous simulation studies are conducted to investigate the asymptotic properties of the ML estimators, validate the effectiveness of the Bayesian information criterion in selecting the appropriate model, and assess the classification ability in the presence of contaminated noise. The utility of the proposed methodology is demonstrated by analyzing a real-life data example.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"199 ","pages":"Article 108016"},"PeriodicalIF":1.5,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141947240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tests for high-dimensional generalized linear models under general covariance structure","authors":"Weichao Yang , Xu Guo , Lixing Zhu","doi":"10.1016/j.csda.2024.108026","DOIUrl":"10.1016/j.csda.2024.108026","url":null,"abstract":"<div><p>This study investigates the testing of regression coefficients within high-dimensional generalized linear models featuring general covariance structures. The derived asymptotic properties reveal that distinct covariance structures can lead to varying limiting null distributions, including the normal distribution, for a widely employed quadratic-norm based test statistic. This circumstance renders it infeasible to determine critical values through a limiting null distribution. In response to this challenge, we propose a multiplier bootstrap test procedure for practical implementation. Additionally, we introduce a modified version of this procedure, incorporating projection when dealing with nuisance parameters. We then proceed to examine the asymptotic level and power of the proposed tests and assess their finite-sample performance through simulations. Finally, we present a real data analysis to illustrate the practical application of the proposed tests.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"199 ","pages":"Article 108026"},"PeriodicalIF":1.5,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141728824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}