{"title":"Minimaxity under the half-Cauchy prior","authors":"Yuzo Maruyama , Takeru Matsuda","doi":"10.1016/j.jmva.2025.105431","DOIUrl":"10.1016/j.jmva.2025.105431","url":null,"abstract":"<div><div>This is a follow-up paper of Polson and Scott (2012, Bayesian Analysis), which claimed that the half-Cauchy prior is a sensible default prior for a scale parameter in hierarchical models. For estimation of a <span><math><mi>p</mi></math></span>-variate normal mean under the quadratic loss, they demonstrated that the Bayes estimator with respect to the half-Cauchy prior seems to be minimax through numerical experiments. In this paper, we theoretically establish the minimaxity of the corresponding Bayes estimator using the interval arithmetic.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"208 ","pages":"Article 105431"},"PeriodicalIF":1.4,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143562585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On estimation and order selection for multivariate extremes via clustering","authors":"Shiyuan Deng , He Tang , Shuyang Bai","doi":"10.1016/j.jmva.2025.105426","DOIUrl":"10.1016/j.jmva.2025.105426","url":null,"abstract":"<div><div>We investigate the estimation of multivariate extreme models with a discrete spectral measure using spherical clustering techniques. The primary contribution involves devising a method for selecting the order, that is, the number of clusters. The method consistently identifies the true order, i.e., the number of spectral atoms, and enjoys intuitive implementation in practice. Specifically, we introduce an extra penalty term to the well-known simplified average silhouette width, which penalizes small cluster sizes and small dissimilarities between cluster centers. Consequently, we provide a consistent method for determining the order of a max-linear factor model, where a typical information-based approach is not viable. Our second contribution is a large-deviation-type analysis for estimating the discrete spectral measure through clustering methods, which serves as an assessment of the convergence quality of clustering-based estimation for multivariate extremes. Additionally, as a third contribution, we discuss how estimating the discrete measure can lead to parameter estimations of heavy-tailed factor models. We also present simulations and real-data studies that demonstrate order selection and factor model estimation.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"208 ","pages":"Article 105426"},"PeriodicalIF":1.4,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143579875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Set-valued expectiles for ordered data analysis","authors":"Andreas H. Hamel, Thi Khanh Linh Ha","doi":"10.1016/j.jmva.2025.105425","DOIUrl":"10.1016/j.jmva.2025.105425","url":null,"abstract":"<div><div>Expectile regions–like depth regions in general–capture the idea of centrality of multivariate distributions. If an order relation is present for the values of random vectors and a decision maker is interested in dominant/best points with respect to this order, centrality is not a useful concept. Therefore, cone expectile sets are introduced which depend on a vector preorder generated by a convex cone. This provides a way of describing and clustering a multivariate distribution/data cloud with respect to an order relation. Fundamental properties of cone expectiles are established including dual representations of both expectile regions and cone expectile sets. It is shown that set-valued sublinear risk measures can be constructed from cone expectile sets in the same way as in the univariate case. Inverse functions of cone expectiles are defined which should be considered as ranking functions related to the initial order relation rather than as depth functions. Finally, expectile orders for random vectors are introduced and characterized via expectile ranking functions.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"208 ","pages":"Article 105425"},"PeriodicalIF":1.4,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143549608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ledoit-Wolf linear shrinkage with unknown mean","authors":"Benoît Oriol , Alexandre Miot","doi":"10.1016/j.jmva.2025.105429","DOIUrl":"10.1016/j.jmva.2025.105429","url":null,"abstract":"<div><div>This work addresses large dimensional covariance matrix estimation with unknown mean. The empirical covariance estimator fails when dimension and number of samples are proportional and tend to infinity, settings known as Kolmogorov asymptotics. When the mean is known, Ledoit and Wolf (2004) proposed a linear shrinkage estimator and proved its convergence under those asymptotics. To the best of our knowledge, no formal proof has been proposed when the mean is unknown. To address this issue, we propose to extend the linear shrinkage and its convergence properties to translation-invariant estimators. We expose four estimators respecting those conditions, proving their properties. Finally, we show empirically that a new estimator we propose outperforms other standard estimators.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"208 ","pages":"Article 105429"},"PeriodicalIF":1.4,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143508247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Markov switching multiple-equation tensor regressions","authors":"Roberto Casarin , Radu V. Craiu , Qing Wang","doi":"10.1016/j.jmva.2025.105427","DOIUrl":"10.1016/j.jmva.2025.105427","url":null,"abstract":"<div><div>A new flexible tensor model for multiple-equation regressions that accounts for latent regime changes is proposed. The model allows for dynamic coefficients and multi-dimensional covariates that vary across equations. The coefficients are driven by a common hidden Markov process that addresses structural breaks to enhance the model flexibility and preserve parsimony. A new soft PARAFAC hierarchical prior is introduced to achieve dimensionality reduction while preserving the structural information of the covariate tensor. The proposed prior includes a new multi-way shrinking effect to address over-parametrization issues while preserving interpretability and model tractability. Theoretical results are derived to help with the choice of the hyperparameters. An efficient Markov chain Monte Carlo (MCMC) algorithm based on random scan Gibbs and back-fitting strategy is designed with priority placed on computational scalability of the posterior sampling. The validity of the MCMC algorithm is demonstrated theoretically, and its computational efficiency is studied using numerical experiments in different parameter settings. The effectiveness of the model framework is illustrated using two original real data analyses. The proposed model exhibits superior performance compared to the current benchmark, Lasso regression.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"208 ","pages":"Article 105427"},"PeriodicalIF":1.4,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143510638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph-constrained analysis for multivariate functional data","authors":"Debangan Dey , Sudipto Banerjee , Martin A. Lindquist , Abhirup Datta","doi":"10.1016/j.jmva.2025.105428","DOIUrl":"10.1016/j.jmva.2025.105428","url":null,"abstract":"<div><div>The manuscript considers multivariate functional data analysis with a known graphical model among the functional variables representing their conditional relationships (e.g., brain region-level fMRI data with a prespecified connectivity graph among brain regions). Functional Gaussian graphical models (GGM) used for analyzing multivariate functional data customarily estimate an unknown graphical model, and cannot preserve knowledge of a given graph. We propose a method for multivariate functional analysis that exactly conforms to a given inter-variable graph. We first show the equivalence between partially separable functional GGM and graphical Gaussian processes (GP), proposed recently for constructing optimal multivariate covariance functions that retain a given graphical model. The theoretical connection helps to design a new algorithm that leverages Dempster’s covariance selection for obtaining the maximum likelihood estimate of the covariance function for multivariate functional data under graphical constraints. We also show that the finite term truncation of functional GGM basis expansion used in practice is equivalent to a low-rank graphical GP, which is known to oversmooth marginal distributions. To remedy this, we extend our algorithm to better preserve marginal distributions while respecting the graph and retaining computational scalability. The benefits of the proposed algorithms are illustrated using empirical experiments and a neuroimaging application.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"207 ","pages":"Article 105428"},"PeriodicalIF":1.4,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143550673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On a class of finite mixture models that includes hidden Markov models","authors":"Francesco Bartolucci , Silvia Pandolfi , Fulvia Pennoni","doi":"10.1016/j.jmva.2025.105423","DOIUrl":"10.1016/j.jmva.2025.105423","url":null,"abstract":"<div><div>In the context of longitudinal data, we introduce a class of finite mixture (FM) models that generalizes that of hidden Markov (HM) models, and derive conditions under which the two classes are equivalent. On the basis of this result, we develop a likelihood ratio (LR) misspecification test for assessing the latent structure of an HM model, along with a multiple version of this test that may be used in the presence of many latent states or time occasions. This testing procedure requires the maximum likelihood estimation of the two models under comparison, that is, the assumed HM model and the more general FM model, which is performed by suitable versions of the Expectation–Maximization algorithm. The approach is validated through a simulation study, aimed at assessing the performance of the proposed tests under different circumstances, and by an application using data derived from the SCImago Journal & Country Rank database.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"208 ","pages":"Article 105423"},"PeriodicalIF":1.4,"publicationDate":"2025-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143445197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical analysis of parsimonious high-order multivariate finite Markov chains based on sufficient statistics","authors":"Yuriy Kharin, Valeriy Voloshko","doi":"10.1016/j.jmva.2025.105422","DOIUrl":"10.1016/j.jmva.2025.105422","url":null,"abstract":"<div><div>A new parsimonious <span><math><mrow><mi>MCSS</mi><mrow><mo>(</mo><mi>s</mi><mo>)</mo></mrow></mrow></math></span> (which stands for “Markov Chain of order <span><math><mi>s</mi></math></span> based on Sufficient Statistics”) model for multivariate discrete-valued time series is constructed. The <span><math><mrow><mi>MCSS</mi><mrow><mo>(</mo><mi>s</mi><mo>)</mo></mrow></mrow></math></span> model has sufficient statistics of a simple form based on multivariate frequencies of <span><math><mrow><mo>(</mo><mi>s</mi><mo>+</mo><mn>1</mn><mo>)</mo></mrow></math></span>-tuples for observed time series. Special cases of the <span><math><mrow><mi>MCSS</mi><mrow><mo>(</mo><mi>s</mi><mo>)</mo></mrow></mrow></math></span> model and their relations to the results known in the literature are discussed. The strong concavity property of the loglikelihood function and the uniqueness of the maximum likelihood estimator under mild regularity conditions are proven for the <span><math><mrow><mi>MCSS</mi><mrow><mo>(</mo><mi>s</mi><mo>)</mo></mrow></mrow></math></span> model. Forecasting statistics for the multivariate discrete-valued time series derived with the <span><math><mrow><mi>MCSS</mi><mrow><mo>(</mo><mi>s</mi><mo>)</mo></mrow></mrow></math></span> model are constructed. The developed theory is illustrated with computer experiments on simulated and real data.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"208 ","pages":"Article 105422"},"PeriodicalIF":1.4,"publicationDate":"2025-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143508246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A review of multivariate permutation tests: Findings and trends","authors":"Rosa Arboretti , Elena Barzizza , Nicoló Biasetton , Marta Disegna","doi":"10.1016/j.jmva.2025.105421","DOIUrl":"10.1016/j.jmva.2025.105421","url":null,"abstract":"<div><div>The permutation test is a widely recognized and frequently used nonparametric hypothesis test, notable for its minimal reliance on assumptions compared to parametric tests. It has found applications in many fields, particularly in multivariate analysis. Since its introduction in the 1930s, permutation tests have been extensively examined both theoretically and empirically. This article provides the results of a comprehensive and systematic review of the literature, focusing on different aspects of multivariate permutation tests. Key articles published in international journals from 2010 onwards have been analyzed, classifying them into four main research strands: data, model, test and issues. These strands were further subdivided into more specific categories. The state of the art and significant developments in this field are summarized, followed by a discussion on future research challenges and trends, offering guidance for the design and development on new approaches.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"207 ","pages":"Article 105421"},"PeriodicalIF":1.4,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143387240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consistency of empirical distributions of sequences of graph statistics in networks with dependent edges","authors":"Jonathan R. Stewart","doi":"10.1016/j.jmva.2025.105420","DOIUrl":"10.1016/j.jmva.2025.105420","url":null,"abstract":"<div><div>One of the first steps in applications of statistical network analysis is frequently to produce summary charts of important features of the network. Many of these features take the form of sequences of graph statistics counting the number of realized events in the network, examples of which are degree distributions, edgewise shared partner distributions, and more. We provide conditions under which the empirical distributions of sequences of graph statistics are consistent in the <span><math><msub><mrow><mi>ℓ</mi></mrow><mrow><mi>∞</mi></mrow></msub></math></span>-norm in settings where edges in the network are dependent. We accomplish this task by deriving concentration inequalities that bound probabilities of deviations of graph statistics from the expected value under weak dependence conditions. We apply our concentration inequalities to empirical distributions of sequences of graph statistics and derive non-asymptotic bounds on the <span><math><msub><mrow><mi>ℓ</mi></mrow><mrow><mi>∞</mi></mrow></msub></math></span>-error which hold with high probability. Our non-asymptotic results are then extended to demonstrate uniform convergence almost surely in selected examples. We illustrate theoretical results through examples, simulation studies, and an application.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"207 ","pages":"Article 105420"},"PeriodicalIF":1.4,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143209494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}