{"title":"Linearized maximum rank correlation estimation when covariates are functional","authors":"Wenchao Xu, Xinyu Zhang, Hua Liang","doi":"10.1016/j.jmva.2024.105301","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105301","url":null,"abstract":"<div><p>This paper extends the linearized maximum rank correlation (LMRC) estimation proposed by Shen et al. (2023) to the setting where the covariate is a function. This extension is nontrivial because inverting the covariance operator can give rise to an ill-posed inverse problem; to address this, we integrate functional principal component analysis into the LMRC procedure. The proposed estimator is robust to outliers in the response and computationally efficient. We establish the rate of convergence of the proposed estimator, which is minimax optimal under certain smoothness assumptions. Furthermore, we extend the proposed estimation procedure to handle discretely observed functional covariates, including both sparse and dense sampling designs, and establish the corresponding rate of convergence. Simulation studies demonstrate that the proposed estimators outperform existing methods in some examples. Finally, we apply our method to a real data set to illustrate its usefulness.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"202 ","pages":"Article 105301"},"PeriodicalIF":1.6,"publicationDate":"2024-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139975931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variable selection in multivariate regression models with measurement error in covariates","authors":"Jingyu Cui, Grace Y. Yi","doi":"10.1016/j.jmva.2024.105299","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105299","url":null,"abstract":"<div><p>Multivariate regression models have been broadly used in analyzing data having multi-dimensional response variables. The use of such models is, however, impeded by the presence of measurement error and spurious variables. While data with such features are common in applications, little work is available that addresses these features jointly. In this article, we consider variable selection under multivariate regression models with covariates subject to measurement error. To gain flexibility, we allow the dimensions of the covariate and response variables to be either fixed or diverging as the sample size increases. A new regularized method is proposed to handle both variable selection and measurement error effects for error-contaminated data. Our proposed penalized bias-corrected least squares method offers flexibility in selecting the penalty function from a class of functions with different features. Importantly, our method does not require full distributional assumptions for the associated variables, thereby broadening its applicability. We rigorously establish theoretical results and describe a computationally efficient procedure for the proposed method. Numerical studies confirm the satisfactory performance of the proposed method in finite-sample settings, and also demonstrate the deleterious effects of ignoring measurement error in inferential procedures.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"202 ","pages":"Article 105299"},"PeriodicalIF":1.6,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139998890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Latent model extreme value index estimation","authors":"Joni Virta, Niko Lietzén, Lauri Viitasaari, Pauliina Ilmonen","doi":"10.1016/j.jmva.2024.105300","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105300","url":null,"abstract":"<div><p>We propose a novel strategy for multivariate extreme value index estimation. In applications such as finance, volatility and risk of multivariate time series are often driven by the same underlying factors. To estimate the latent risks, we apply a two-stage procedure. First, a set of independent latent series is estimated using a method of latent variable analysis. Then, univariate risk measures are estimated individually for the latent series. We provide conditions under which the effect of the latent model estimation on the asymptotic behavior of the risk estimators is negligible. Simulations illustrate the theory under both i.i.d. and dependent data, and an application to currency exchange rate data shows that the method is able to discover extreme behavior not found by component-wise analysis of the original series.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"202 ","pages":"Article 105300"},"PeriodicalIF":1.6,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0047259X24000071/pdfft?md5=ce435e68e1036f63dbf20ffeb86fc426&pid=1-s2.0-S0047259X24000071-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139743276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of multiple networks with common structures in heterogeneous subgroups","authors":"Xing Qin, Jianhua Hu, Shuangge Ma, Mengyun Wu","doi":"10.1016/j.jmva.2024.105298","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105298","url":null,"abstract":"<div><p>Network estimation has been a critical component of high-dimensional data analysis and can provide an understanding of the underlying complex dependence structures. Among the existing studies, Gaussian graphical models have been highly popular. However, they still have limitations due to the homogeneous distribution assumption and the fact that they are only applicable to small-scale data. For example, cancers have various levels of unknown heterogeneity, and biological networks, which include thousands of molecular components, often differ across subgroups while also sharing some commonalities. In this article, we propose a new joint estimation approach for multiple networks with unknown sample heterogeneity, by decomposing the Gaussian graphical model (GGM) into a collection of sparse regression problems. A reparameterization technique and a composite minimax concave penalty are introduced to effectively accommodate the specific and common information across the networks of multiple subgroups. The resulting estimator advances significantly beyond existing heterogeneity network analysis based directly on the regularized likelihood of the GGM, and enjoys scale invariance, tuning insensitivity, and convexity of the optimization problem. The proposed analysis can be effectively realized using parallel computing. The estimation and selection consistency properties are rigorously established. The proposed approach allows the theoretical studies to focus on independent network estimation only and has the significant advantage of being both theoretically and computationally applicable to large-scale data. Extensive numerical experiments with simulated data and the TCGA breast cancer data demonstrate the prominent performance of the proposed approach in both subgroup and network identification.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"202 ","pages":"Article 105298"},"PeriodicalIF":1.6,"publicationDate":"2024-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139749242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A data depth based nonparametric test of independence between two random vectors","authors":"Sakineh Dehghan, Mohammad Reza Faridrohani","doi":"10.1016/j.jmva.2024.105297","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105297","url":null,"abstract":"<div><p>A new family of depth-based test statistics is proposed for testing the hypothesis of independence between two random vectors. In the procedure to derive the asymptotic distribution of the tests under the null hypothesis, we do not require any symmetry assumption on the distribution functions. Furthermore, a conditional distribution-free property of the tests is shown. The asymptotic relative efficiency of the tests is discussed under the class of elliptically symmetric distributions. Asymptotic relative efficiencies along with Monte Carlo results suggest that the performance of the proposed class is comparable to the existing ones, and under some circumstances, it has higher power. Finally, we apply the tests to two real data sets and also discuss the robustness of our tests.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"202 ","pages":"Article 105297"},"PeriodicalIF":1.6,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139718894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Convergence analysis of data augmentation algorithms for Bayesian robust multivariate linear regression with incomplete data","authors":"Haoxiang Li, Qian Qin, Galin L. Jones","doi":"10.1016/j.jmva.2024.105296","DOIUrl":"10.1016/j.jmva.2024.105296","url":null,"abstract":"<div><p>Gaussian mixtures are commonly used for modeling heavy-tailed error distributions in robust linear regression. Combining the likelihood of a multivariate robust linear regression model with a standard improper prior distribution yields an analytically intractable posterior distribution that can be sampled using a data augmentation algorithm. When the response matrix has missing entries, there are unique challenges to the application and analysis of the convergence properties of the algorithm. Conditions for geometric ergodicity are provided when the incomplete data have a “monotone” structure. In the absence of a monotone structure, an intermediate imputation step is necessary for implementing the algorithm. In this case, we provide sufficient conditions for the algorithm to be Harris ergodic. Finally, we show that, when there is a monotone structure and intermediate imputation is unnecessary, intermediate imputation slows the convergence of the underlying Monte Carlo Markov chain, while post hoc imputation does not. An R package for the data augmentation algorithm is provided.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"202 ","pages":"Article 105296"},"PeriodicalIF":1.6,"publicationDate":"2024-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139509069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On positive association of absolute-valued and squared multivariate Gaussians beyond MTP2","authors":"Helmut Finner, Markus Roters","doi":"10.1016/j.jmva.2024.105295","DOIUrl":"10.1016/j.jmva.2024.105295","url":null,"abstract":"<div><p>We show that positively associated squared (and absolute-valued) multivariate normally distributed random vectors need not be multivariate totally positive of order 2 (MTP<sub>2</sub>) for <span><math><mrow><mi>p</mi><mo>≥</mo><mn>3</mn></mrow></math></span>. This result disproves Theorem 1 in Eisenbaum (2014, Ann. Probab.) and the conjecture that positive association of squared multivariate normals is equivalent to MTP<sub>2</sub> and infinite divisibility of squared multivariate normals. Among other results, we show that there exist absolute-valued multivariate normals which are conditionally increasing in sequence (CIS) (or weakly CIS (WCIS)) and hence positively associated but not MTP<sub>2</sub>. Moreover, we show that there exist absolute-valued multivariate normals which are positively associated but not CIS. As a by-product, we obtain necessary conditions for CIS and WCIS of absolute normals. We illustrate these conditions in some examples. With respect to implications and applications of our results, we show PA beyond MTP<sub>2</sub> for some related multivariate distributions (chi-square, <span><math><mi>t</mi></math></span>, skew normal) and refer to possible conservative multiple test procedures and conservative simultaneous confidence bounds. Finally, we obtain the validity of the strong form of Gaussian product inequalities beyond MTP<sub>2</sub>.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"202 ","pages":"Article 105295"},"PeriodicalIF":1.6,"publicationDate":"2024-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0047259X24000022/pdfft?md5=057f93b4d2894763a0a0d8fd83de8805&pid=1-s2.0-S0047259X24000022-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139461529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of sparse covariance matrix via non-convex regularization","authors":"Xin Wang, Lingchen Kong, Liqun Wang","doi":"10.1016/j.jmva.2024.105294","DOIUrl":"10.1016/j.jmva.2024.105294","url":null,"abstract":"<div><p>Estimation of a high-dimensional sparse covariance matrix is one of the fundamental problems in multivariate analysis and has a wide range of applications in many fields. This paper presents a novel method for sparse covariance matrix estimation via solving a non-convex regularized optimization problem. We establish the asymptotic properties of the proposed estimator and develop a multi-stage convex relaxation method to find an effective estimator. The multi-stage convex relaxation method guarantees that any accumulation point of the generated sequence is a first-order stationary point of the non-convex optimization problem. Moreover, the error bounds of the first two stage estimators of the multi-stage convex relaxation method are derived under some regularity conditions. The numerical results show that our estimator outperforms the state-of-the-art estimators and attains a high degree of sparsity while remaining effective.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"202 ","pages":"Article 105294"},"PeriodicalIF":1.6,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139101885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hypothesis testing for mean vector and covariance matrix of multi-populations under a two-step monotone incomplete sample in large sample and dimension","authors":"Shin-ichi Tsukada","doi":"10.1016/j.jmva.2023.105290","DOIUrl":"10.1016/j.jmva.2023.105290","url":null,"abstract":"<div><p>In this study, we focus on the critical issue of analyzing data sets with missing data. Statistically processing such data sets, particularly those with general missing data, is challenging to express in explicit formulae, and often requires computational algorithms to solve. We specifically address monotone missing data, which are the simplest form of data sets with missing data. We conduct hypothesis tests to determine the equivalence of mean vectors and covariance matrices across different populations. Furthermore, we derive the properties of likelihood ratio test statistics in scenarios involving large samples and large dimensions.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"202 ","pages":"Article 105290"},"PeriodicalIF":1.6,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139068375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of extreme multivariate expectiles with functional covariates","authors":"Elena Di Bernardino, Thomas Laloë, Cambyse Pakzad","doi":"10.1016/j.jmva.2023.105292","DOIUrl":"10.1016/j.jmva.2023.105292","url":null,"abstract":"<div><p>The present article is devoted to the semi-parametric estimation of multivariate expectiles for extreme levels. The considered multivariate risk measures also include the possible conditioning with respect to a functional covariate, belonging to an infinite-dimensional space. By using the first order optimality condition, we interpret these expectiles as solutions of a multidimensional nonlinear optimization problem. Then the inference is based on a minimization algorithm of gradient descent type, coupled with consistent kernel estimations of our key statistical quantities such as conditional quantiles, conditional tail index and conditional tail dependence functions. The method is valid for equivalently heavy-tailed marginals and under a multivariate regular variation condition on the underlying unknown random vector with arbitrary dependence structure. Our main result establishes the consistency in probability of the approximate optimal solution vectors, together with a rate of convergence. This allows us to estimate the global computational cost of the whole procedure according to the data sample size. The finite-sample performance of our methodology is illustrated numerically on simulated datasets.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"202 ","pages":"Article 105292"},"PeriodicalIF":1.6,"publicationDate":"2023-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139031522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}