{"title":"Conformal prediction for multivariate responses with Euclidean likelihood","authors":"Feichen Gan, Yukun Liu","doi":"10.1016/j.jmva.2025.105494","DOIUrl":"10.1016/j.jmva.2025.105494","url":null,"abstract":"<div><div>Multivariate response analysis offers a more comprehensive understanding of the phenomena being studied than univariate response analysis. Despite the widespread popularity of conformal inference for a univariate response, relatively little research has been conducted on its application to multivariate responses. In this paper, we propose a novel conformal prediction method for multivariate response by taking the Euclidean likelihood ratio test statistic for a multivariate mean as a non-conformity score. To make full use of data information, we propose to calibrate the non-conformity score using the Jackknife method or a re-sampling technique in the absence and presence of covariate shift. Our approach can flexibly integrate pre-trained statistical or machine learning models and auxiliary information defined through estimating equations. Asymptotic coverage guarantees are established for the proposed conformal prediction regions. Our simulation and real analysis indicate that compared with the existing competitors, the proposed conformal prediction regions usually have desirable coverage probabilities with smaller volumes.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"210 ","pages":"Article 105494"},"PeriodicalIF":1.4,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144858275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finite mixture representations of zero-and-N-inflated distributions for count-compositional data","authors":"André F.B. Menezes , Andrew C. Parnell , Keefe Murphy","doi":"10.1016/j.jmva.2025.105492","DOIUrl":"10.1016/j.jmva.2025.105492","url":null,"abstract":"<div><div>We provide novel probabilistic portrayals of two multivariate models designed to handle zero-inflation in count-compositional data. We develop a new unifying framework that represents both as finite mixture distributions. One of these distributions, based on Dirichlet-multinomial components, has been studied before, but has not yet been properly characterised as a sampling distribution of the counts. The other, based on multinomial components, is a new contribution. Using our finite mixture representations enables us to derive key statistical properties, including moments, marginal distributions, and special cases for both distributions. We develop enhanced Bayesian inference schemes with efficient Gibbs sampling updates, wherever possible, for parameters and auxiliary variables, demonstrating improvements over existing methods in the literature. We conduct simulation studies to evaluate the efficiency of the Bayesian inference procedures and present applications to a human gut microbiome dataset to illustrate the practical utility of the proposed distributions.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"210 ","pages":"Article 105492"},"PeriodicalIF":1.4,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144895898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Schrödinger bridge based deep conditional generative learning","authors":"Hanwen Huang , Manyu Huang","doi":"10.1016/j.jmva.2025.105486","DOIUrl":"10.1016/j.jmva.2025.105486","url":null,"abstract":"<div><div>Conditional generative models represent a significant advancement in machine learning, enabling controlled data synthesis by incorporating additional information into the generation process. In this work, we introduce a novel Schrödinger bridge-based deep generative method for learning conditional distributions. Our approach begins with a unit-time diffusion process governed by a stochastic differential equation (SDE) that evolves a fixed point at time <span><math><mrow><mi>t</mi><mo>=</mo><mn>0</mn></mrow></math></span> into a desired target conditional distribution at <span><math><mrow><mi>t</mi><mo>=</mo><mn>1</mn></mrow></math></span>. For effective implementation, we discretize the SDE using the Euler–Maruyama method, estimating the drift term nonparametrically with a deep neural network. We apply our method to both low-dimensional and high-dimensional conditional generation tasks. Numerical studies show that, although our method does not directly provide conditional density estimation, the samples generated exhibit higher quality than those from several existing methods. Furthermore, the generated samples can be effectively used to estimate the conditional density and related statistical quantities, such as the conditional mean and conditional standard deviation.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"210 ","pages":"Article 105486"},"PeriodicalIF":1.4,"publicationDate":"2025-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144829784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Likelihood ratio test for covariance matrix under multivariate t distribution with uncorrelated observations","authors":"Katarzyna Filipiak , Daniel Klein , Stepan Mazur , Malwina Mrowińska","doi":"10.1016/j.jmva.2025.105490","DOIUrl":"10.1016/j.jmva.2025.105490","url":null,"abstract":"<div><div>In this paper, estimators for the unknown parameters under two types of matrix-variate <span><math><mi>t</mi></math></span> distributions are determined, and their basic statistical properties, including bias and sufficiency, are investigated. These estimators are then applied to test hypotheses concerning the covariance structure of a multivariate <span><math><mi>t</mi></math></span> distribution associated with a collection of uncorrelated, though not necessarily independent, observation vectors, using two types of matrix-variate <span><math><mi>t</mi></math></span> distributions. A likelihood ratio test is proposed, and its distributional properties under the null hypothesis are examined, assuming either a fully specified covariance matrix or one specified up to a constant. Furthermore, it is demonstrated that the asymptotic distribution for the type I matrix-variate <span><math><mi>t</mi></math></span> distribution under both hypotheses coincides with that under the normality assumption. Finally, for testing a fully specified covariance matrix, the asymptotic distribution of the likelihood ratio test statistic is determined.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"210 ","pages":"Article 105490"},"PeriodicalIF":1.4,"publicationDate":"2025-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144829785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exact mean and covariance formulas after diagonal transformations of a multivariate normal","authors":"Rebecca Morrison , Estelle Basor","doi":"10.1016/j.jmva.2025.105489","DOIUrl":"10.1016/j.jmva.2025.105489","url":null,"abstract":"<div><div>Consider <span><math><mrow><mi>X</mi><mo>∼</mo><mi>N</mi><mrow><mo>(</mo><mi>0</mi><mo>,</mo><mi>Σ</mi><mo>)</mo></mrow></mrow></math></span> and <span><math><mrow><mi>Y</mi><mo>=</mo><mrow><mo>(</mo><msub><mrow><mi>f</mi></mrow><mrow><mn>1</mn></mrow></msub><mrow><mo>(</mo><msub><mrow><mi>X</mi></mrow><mrow><mn>1</mn></mrow></msub><mo>)</mo></mrow><mo>,</mo><msub><mrow><mi>f</mi></mrow><mrow><mn>2</mn></mrow></msub><mrow><mo>(</mo><msub><mrow><mi>X</mi></mrow><mrow><mn>2</mn></mrow></msub><mo>)</mo></mrow><mo>,</mo><mo>…</mo><mo>,</mo><msub><mrow><mi>f</mi></mrow><mrow><mi>d</mi></mrow></msub><mrow><mo>(</mo><msub><mrow><mi>X</mi></mrow><mrow><mi>d</mi></mrow></msub><mo>)</mo></mrow><mo>)</mo></mrow></mrow></math></span>. We call this a diagonal transformation of a multivariate normal. In this paper we compute exactly the mean vector and covariance matrix of the random vector <span><math><mrow><mi>Y</mi><mo>.</mo></mrow></math></span> This is done in two different ways: One approach uses a series expansion for the function <span><math><msub><mrow><mi>f</mi></mrow><mrow><mi>i</mi></mrow></msub></math></span> and the other a transform method. We compute several examples, show how the covariance entries can be estimated, and compare the theoretical results with numerical ones.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"210 ","pages":"Article 105489"},"PeriodicalIF":1.4,"publicationDate":"2025-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144829783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inference for overparametrized hierarchical Archimedean copulas","authors":"Samuel Perreault , Yanbo Tang , Ruyi Pan , Nancy Reid","doi":"10.1016/j.jmva.2025.105483","DOIUrl":"10.1016/j.jmva.2025.105483","url":null,"abstract":"<div><div>Hierarchical Archimedean copulas (HACs) are multivariate uniform distributions constructed by nesting Archimedean copulas into one another, and provide a flexible approach to modeling non-exchangeable data. However, this flexibility in the model structure may lead to over-fitting when the model estimation procedure is not performed properly. In this paper, we examine the problem of structure estimation and more generally on the selection of a parsimonious model from the hypothesis testing perspective. Formal tests for structural hypotheses concerning HACs have been lacking so far, most likely due to the restrictions on their associated parameter space which hinders the use of standard inference methodology. Building on previously developed asymptotic methods for these non-standard parameter spaces, we provide an asymptotic stochastic representation for the maximum likelihood estimators of (potentially) overparametrized HACs, which we then use to formulate a likelihood ratio test for certain common structural hypotheses. Additionally, we also derive analytical expressions for the first- and second-order partial derivatives of two-level HACs based on Clayton and Gumbel generators, as well as general numerical approximation schemes for the Fisher information matrix.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"210 ","pages":"Article 105483"},"PeriodicalIF":1.4,"publicationDate":"2025-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144840792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced HSIC for independence test via projection integration","authors":"Zhimei Li , Tianxuan Ding , Tingyou Zhou , Yaowu Zhang","doi":"10.1016/j.jmva.2025.105485","DOIUrl":"10.1016/j.jmva.2025.105485","url":null,"abstract":"<div><div>Among the various measures of dependence between two random vectors, the Hilbert–Schmidt independence criterion (HSIC) is widely recognized and has gained significant attention in recent years. However, HSIC-based tests can become less effective as dimensionality increases and nonlinear dependencies become more complex. In this paper, we introduce a novel method that integrates the HSIC with a Gaussian kernel over all one-dimensional projections. The resulting metric has a closed-form expression, is non-negative, and equals zero if and only if the random vectors are independent. We estimate the integrated HSIC using <span><math><mi>U</mi></math></span>-statistic theory and analyze its asymptotic properties under the null hypothesis and two types of alternative hypotheses. Comprehensive numerical studies demonstrate that our method preserves the advantages of HSIC in univariate settings while effectively capturing complex nonlinear dependencies as dimensionality increases.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"210 ","pages":"Article 105485"},"PeriodicalIF":1.4,"publicationDate":"2025-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144780260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The multirank likelihood for semiparametric canonical correlation analysis","authors":"Jordan G. Bryan , Jonathan Niles-Weed , Peter D. Hoff","doi":"10.1016/j.jmva.2025.105484","DOIUrl":"10.1016/j.jmva.2025.105484","url":null,"abstract":"<div><div>Many analyses of multivariate data focus on evaluating the dependence between two sets of variables, rather than the dependence among individual variables within each set. Canonical correlation analysis (CCA) is a classical data analysis technique that estimates parameters describing the dependence between such sets. However, inference procedures based on traditional CCA rely on the assumption that all variables are jointly normally distributed. We present a semiparametric approach to CCA in which the multivariate margins of each variable set may be arbitrary, but the dependence between variable sets is described by a parametric model that provides low-dimensional summaries of dependence. While maximum likelihood estimation in the proposed model is intractable, we propose two estimation strategies: one using a pseudolikelihood for the model and one using a Markov chain Monte Carlo (MCMC) algorithm that provides Bayesian estimates and confidence regions for the between-set dependence parameters. The MCMC algorithm is derived from a multirank likelihood function, which uses only part of the information in the observed data in exchange for being free of assumptions about the multivariate margins. We apply the proposed Bayesian inference procedure to Brazilian climate data and monthly stock returns from the materials and communications market sectors.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"210 ","pages":"Article 105484"},"PeriodicalIF":1.4,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144748754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimators for multivariate allometric regression model","authors":"Koji Tsukuda , Shun Matsuura","doi":"10.1016/j.jmva.2025.105482","DOIUrl":"10.1016/j.jmva.2025.105482","url":null,"abstract":"<div><div>In a regression model with multiple response variables and multiple explanatory variables, if the difference of the mean vectors of the response variables for different values of explanatory variables is always in the direction of the first principal eigenvector of the covariance matrix of the response variables, then it is called a multivariate allometric regression model. This paper studies the estimation of the first principal eigenvector in the multivariate allometric regression model. A class of estimators that includes conventional estimators is proposed based on weighted sum-of-squares matrices of regression sum-of-squares matrix and residual sum-of-squares matrix. We establish an upper bound of the mean squared error of the estimators contained in this class, and the weight value minimizing the upper bound is derived. Sufficient conditions for the consistency of the estimators are discussed in weak identifiability regimes under which the difference of the largest and second largest eigenvalues of the covariance matrix decays asymptotically and in “large <span><math><mi>p</mi></math></span>, large <span><math><mi>n</mi></math></span>” regimes, where <span><math><mi>p</mi></math></span> is the number of response variables and <span><math><mi>n</mi></math></span> is the sample size. Several numerical results are also presented.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"210 ","pages":"Article 105482"},"PeriodicalIF":1.4,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144714168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On properties of fractional posterior in generalized reduced-rank regression","authors":"The Tien Mai","doi":"10.1016/j.jmva.2025.105481","DOIUrl":"10.1016/j.jmva.2025.105481","url":null,"abstract":"<div><div>Reduced rank regression (RRR) is a widely employed model for investigating the linear association between multiple response variables and a set of predictors. While RRR has been extensively explored in various works, the focus has predominantly been on continuous response variables, overlooking other types of outcomes. This study shifts its attention to the Bayesian perspective of generalized linear models (GLM) within the RRR framework. In this work, we relax the requirement for the link function of the generalized linear model to be canonical. We examine the properties of fractional posteriors in GLM within the RRR context, where a fractional power of the likelihood is utilized. By employing a spectral scaled Student prior distribution, we establish consistency and concentration results for the fractional posterior. Our results highlight adaptability, as they do not necessitate prior knowledge of the rank of the parameter matrix. These results are in line with those found in frequentist literature. We also investigate the impact of model misspecification, demonstrating the robustness of our approach in such cases. Numerical simulations and real data analyses are conducted to illustrate the promising performance of our approach compared to the state-of-the-art method.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"210 ","pages":"Article 105481"},"PeriodicalIF":1.4,"publicationDate":"2025-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144687416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}