{"title":"Explicit bivariate simplicial depth","authors":"Erik Mendroš, Stanislav Nagy","doi":"10.1016/j.jmva.2024.105375","DOIUrl":"10.1016/j.jmva.2024.105375","url":null,"abstract":"<div><div>The simplicial depth (SD) is a celebrated tool defining elements of nonparametric and robust statistics for multivariate data. While many properties of SD are well-established, and its applications are abundant, explicit expressions for SD are known only for a handful of the simplest multivariate probability distributions. This paper deals with SD in the plane. It (i) develops a one-dimensional integral formula for SD of any properly continuous probability distribution, (ii) gives exact explicit expressions for SD of uniform distributions on (both convex and non-convex) polygons in the plane or on the boundaries of such polygons, and (iii) discusses several implications of these findings to probability and statistics: (a) An upper bound on the maximum SD in the plane, (b) an implication for a test of symmetry of a bivariate distribution, and (c) a connection of SD with the classical Sylvester problem from geometric probability.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"205 ","pages":"Article 105375"},"PeriodicalIF":1.4,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142428480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large sample correlation matrices with unbounded spectrum","authors":"Yanpeng Li","doi":"10.1016/j.jmva.2024.105373","DOIUrl":"10.1016/j.jmva.2024.105373","url":null,"abstract":"<div><div>In this paper, we demonstrate that the diagonal of a high-dimensional sample covariance matrix stemming from <span><math><mi>n</mi></math></span> independent observations of a <span><math><mi>p</mi></math></span>-dimensional time series with finite fourth moments can be approximated in spectral norm by the diagonal of the population covariance matrix regardless of the spectral norm of the population covariance matrix. Our assumptions involve <span><math><mi>p</mi></math></span> and <span><math><mi>n</mi></math></span> tending to infinity, with <span><math><mrow><mi>p</mi><mo>/</mo><mi>n</mi></mrow></math></span> tending to a constant which might be positive or zero. Consequently, we investigate the asymptotic properties of the sample correlation matrix with a divergent spectrum, and we explore its applications by deriving the limiting spectral distribution for its eigenvalues and analyzing the convergence of divergent and non-divergent spiked eigenvalues under a generalized spiked correlation framework.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"205 ","pages":"Article 105373"},"PeriodicalIF":1.4,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142428479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection and localization of changes in a panel of densities","authors":"Tim Kutta , Agnieszka Jach , Michel Ferreira Cardia Haddad , Piotr Kokoszka , Haonan Wang","doi":"10.1016/j.jmva.2024.105374","DOIUrl":"10.1016/j.jmva.2024.105374","url":null,"abstract":"<div><div>We propose a new methodology for identifying and localizing changes in the Fréchet mean of a multivariate time series of probability densities. The functional data objects we study are random densities <span><math><msub><mrow><mi>f</mi></mrow><mrow><mi>s</mi><mo>,</mo><mi>t</mi></mrow></msub></math></span> indexed by discrete time <span><math><mi>t</mi></math></span> and a vector component <span><math><mi>s</mi></math></span>, which can be treated as a broadly understood spatial location. Our main objective is to identify the set of components <span><math><mi>s</mi></math></span>, where a change occurs with statistical certainty. A challenge of this analysis is that the densities <span><math><msub><mrow><mi>f</mi></mrow><mrow><mi>s</mi><mo>,</mo><mi>t</mi></mrow></msub></math></span> are not directly observable and must be estimated from sparse and potentially imbalanced data. Such setups are motivated by the analysis of two data sets that we investigate in this work. First, a hitherto unpublished large data set of Brazilian Covid infections and a second, a financial data set derived from intraday prices of U.S. Exchange Traded Funds. Chief statistical advances are the development of change point tests and estimators of components of change for multivariate time series of densities. We prove the theoretical validity of our methodology and investigate its finite sample performance in a simulation study.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"205 ","pages":"Article 105374"},"PeriodicalIF":1.4,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142327131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data depth functions for non-standard data by use of formal concept analysis","authors":"Hannah Blocher, Georg Schollmeyer","doi":"10.1016/j.jmva.2024.105372","DOIUrl":"10.1016/j.jmva.2024.105372","url":null,"abstract":"<div><div>In this article we introduce a notion of depth functions for data types that are not given in standard statistical data formats. We focus on data that cannot be represented by one specific data structure, such as normed vector spaces. This covers a wide range of different data types, which we refer to as non-standard data. Depth functions have been studied intensively for normed vector spaces. However, a discussion of depth functions for non-standard data is lacking. In this article, we address this gap by using formal concept analysis to obtain a unified data representation. Building on this representation, we then define depth functions for non-standard data. Furthermore, we provide a systematic basis by introducing structural properties using the data representation provided by formal concept analysis. Finally, we embed the generalised Tukey depth into our concept of data depth and analyse it using the introduced structural properties. Thus, this article presents the mathematical formalisation of centrality and outlyingness for non-standard data and increases the number of spaces in which centrality can be discussed. In particular, we provide a basis for defining further depth functions and statistical inference methods for non-standard data.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"205 ","pages":"Article 105372"},"PeriodicalIF":1.4,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142327125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scaled envelope models for multivariate time series","authors":"H.M. Wiranthe B. Herath , S. Yaser Samadi","doi":"10.1016/j.jmva.2024.105370","DOIUrl":"10.1016/j.jmva.2024.105370","url":null,"abstract":"<div><p>Vector autoregressive (VAR) models have become a popular choice for modeling multivariate time series data due to their simplicity and ease of use. Efficient estimation of VAR coefficients is an important problem. The envelope technique for VAR models is demonstrated to have the potential to yield significant gains in efficiency and accuracy by incorporating linear combinations of the response vector that are essentially immaterial to the estimation of the VAR coefficients. However, inferences based on envelope VAR (EVAR) models are not invariant or equivariant upon the rescaling of the VAR responses, limiting their application to time series data that are measured in the same or similar units. In scenarios where VAR responses are measured on different scales, the efficiency improvements promised by envelopes are not always guaranteed. To address this limitation, we introduce the scaled envelope VAR (SEVAR) model, which preserves the efficiency-boosting capabilities of standard envelope techniques while remaining invariant to scale changes. The asymptotic characteristics of the proposed estimators are established based on different error assumptions. Simulation studies and real-data analysis are conducted to demonstrate the efficiency and effectiveness of the proposed model. The numerical results corroborate our theoretical findings.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"205 ","pages":"Article 105370"},"PeriodicalIF":1.4,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0047259X24000770/pdfft?md5=bcb10a9c98d350b55789c52bc615d145&pid=1-s2.0-S0047259X24000770-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142240404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A bias-corrected Srivastava-type test for cross-sectional independence","authors":"Kai Xu , Mingxiang Cao , Qing Cheng","doi":"10.1016/j.jmva.2024.105371","DOIUrl":"10.1016/j.jmva.2024.105371","url":null,"abstract":"<div><p>This paper proposes a test for cross-sectional independence with high dimensional panel data. It uses the random matrix theory based approach of Srivastava (2005) in the presence of a large number of cross-sectional units and time series observations. Because the errors are unobservable, the residuals from the regression model for panel data are used. We develop a bias-corrected test after adjusting for the contribution from the regressors. With the aid of the martingale central limit theorem, we prove that the limiting null distribution of the proposed test statistic is normal under mild conditions as cross-sectional dimension and time dimension go to infinity together. We further study the asymptotic relative efficiency of our proposed test with respect to the state-of-art Lagrange multiplier test. An interesting finding is that the newly proposed test can have substantial power gain when the underlying variance magnitudes are not identical across different units.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"205 ","pages":"Article 105371"},"PeriodicalIF":1.4,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0047259X24000782/pdfft?md5=792309b6f97ca51742555998cfec1771&pid=1-s2.0-S0047259X24000782-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142240405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Invariant correlation under marginal transforms","authors":"Takaaki Koike , Liyuan Lin , Ruodu Wang","doi":"10.1016/j.jmva.2024.105361","DOIUrl":"10.1016/j.jmva.2024.105361","url":null,"abstract":"<div><p>A useful property of independent samples is that their correlation remains the same after applying marginal transforms. This invariance property plays a fundamental role in statistical inference, but does not hold in general for dependent samples. In this paper, we study this invariance property on the Pearson correlation coefficient and its applications. A multivariate random vector is said to have an invariant correlation if its pairwise correlation coefficients remain unchanged under any common marginal transforms. For a bivariate case, we characterize all models of such a random vector via a certain combination of comonotonicity—the strongest form of positive dependence—and independence. In particular, we show that the class of exchangeable copulas with invariant correlation is precisely described by what we call positive Fréchet copulas. In the general multivariate case, we characterize the set of all invariant correlation matrices via the clique partition polytope. We also propose a positive regression dependent model that admits any prescribed invariant correlation matrix. Finally, we show that all our characterization results of invariant correlation, except one special case, remain the same if the common marginal transforms are confined to the set of increasing ones.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"204 ","pages":"Article 105361"},"PeriodicalIF":1.4,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0047259X2400068X/pdfft?md5=87348d6db627c38f7dec7cb4cd435464&pid=1-s2.0-S0047259X2400068X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142122196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grouped feature screening for ultrahigh-dimensional classification via Gini distance correlation","authors":"Yongli Sang , Xin Dang","doi":"10.1016/j.jmva.2024.105360","DOIUrl":"10.1016/j.jmva.2024.105360","url":null,"abstract":"<div><p>Gini distance correlation (GDC) was recently proposed to measure the dependence between a categorical variable, <span><math><mi>Y</mi></math></span>, and a numerical random vector, <span><math><mi>X</mi></math></span>. It mutually characterizes independence between <span><math><mi>X</mi></math></span> and <span><math><mi>Y</mi></math></span>. In this article, we utilize the GDC to establish a feature screening for ultrahigh-dimensional discriminant analysis where the response variable is categorical. It can be used for screening individual features as well as grouped features. The proposed procedure possesses several appealing properties. It is model-free. No model specification is needed. It holds the sure independence screening property and the ranking consistency property. The proposed screening method can also deal with the case that the response has divergent number of categories. We conduct several Monte Carlo simulation studies to examine the finite sample performance of the proposed screening procedure. Real data analysis for two real life datasets are illustrated.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"204 ","pages":"Article 105360"},"PeriodicalIF":1.4,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142088010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The inner partial least square: An exploration of the “necessary” dimension reduction","authors":"Yunjian Yin, Lan Liu","doi":"10.1016/j.jmva.2024.105356","DOIUrl":"10.1016/j.jmva.2024.105356","url":null,"abstract":"<div><p>The partial least square (PLS) algorithm retains the combinations of predictors that maximize the covariance with the outcome. Cook et al. (2013) showed that PLS results in a predictor envelope, which is the smallest reducing subspace of predictors’ covariance that contains the coefficient. However, PLS and predictor envelope both target at a space that contains the regression coefficients and therefore they may sometimes be too conservative to reduce the dimension of the predictors. In this paper, we propose a new method that may improve the estimation efficiency of regression coefficients when both PLS and predictor envelope fail to do so. Specifically, our method results in the largest reducing subspace of predictors’ covariance that is contained in the coefficient matrix space. Interestingly, the moment based algorithm of our proposed method can be achieved by changing the max in PLS to min. We define the modified PLS as the inner PLS and the resulting space as the inner predictor envelope space. We provide the theoretical properties of our proposed methods as well as demonstrate their use in China Health and Nutrition Survey.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"204 ","pages":"Article 105356"},"PeriodicalIF":1.4,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142083637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross projection test for mean vectors via multiple random splits in high dimensions","authors":"Guanpeng Wang , Jiujing Wu , Hengjian Cui","doi":"10.1016/j.jmva.2024.105358","DOIUrl":"10.1016/j.jmva.2024.105358","url":null,"abstract":"<div><p>The cross projection test (CPT) technique is extended to high-dimensional two-sample mean tests in this article, which was first proposed by Wang and Cui (2024). A data-splitting strategy is required to find the projection directions that reduce the data from high dimensional space to low dimensional space which can well solve the issue of “the curse of dimensionality”. As long as both samples are randomly split once, two correlated cross projection statistics can be established according to the CPT development mechanism, which is similar to all constructed test statistics that exist the correlation caused by multiple random splits. To deal with this issue and improve the performance of empirical powers by eliminating the randomness of data-splitting, we further utilize a powerful Cauchy combination test algorithm based on multiple data-splitting. Theoretically, we prove the asymptotic property of the proposed test statistic. Furthermore, for the sparse alternative case, we apply the power enhancement technique to the ensemble Cauchy combination test-based algorithm in marginal screening for the full data. Numerical studies through Monte Carlo simulations and two real data examples are conducted simultaneously to illustrate the utility of our proposed ensemble algorithm.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"204 ","pages":"Article 105358"},"PeriodicalIF":1.4,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142020744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}