{"title":"Change point analysis of functional variance function with stationary error","authors":"Qirui Hu","doi":"10.1016/j.jmva.2024.105311","DOIUrl":"10.1016/j.jmva.2024.105311","url":null,"abstract":"<div><p>An asymptotically correct test for an abrupt break in functional variance function of measurement error in the functional sequence and the confidence interval of change point is constructed. Under general assumptions, the test and detection procedure conducted by Spline-backfitted kernel smoothing, i.e., recovering trajectories with B-spline and estimating variance function with kernel regression, enjoy oracle efficiency, namely, the proposed procedure is asymptotically indistinguishable from that with accurate trajectories. Furthermore, a consistent algorithm for multiple change points based on the binary segment is derived. Extensive simulation studies reveal a positive confirmation of the asymptotic theory. The proposed method is applied to analyze EEG data.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140152987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On heavy-tailed risks under Gaussian copula: The effects of marginal transformation","authors":"Bikramjit Das , Vicky Fasen-Hartmann","doi":"10.1016/j.jmva.2024.105310","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105310","url":null,"abstract":"<div><p>In this paper, we compute multivariate tail risk probabilities where the marginal risks are heavy-tailed and the dependence structure is a Gaussian copula. The marginal heavy-tailed risks are modeled using regular variation which leads to a few interesting consequences. First, as the threshold increases, we note that the rate of decay of probabilities of tail sets varies depending on the type of tail sets considered and the Gaussian correlation matrix. Second, we discover that although any multivariate model with a Gaussian copula admits the so-called asymptotic tail independence property, the joint tail behavior under heavier tailed marginal variables is structurally distinct from that under Gaussian marginal variables. The results obtained are illustrated using examples and simulations.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140031062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-dimensional nonconvex LASSO-type M-estimators","authors":"Jad Beyhum , François Portier","doi":"10.1016/j.jmva.2024.105303","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105303","url":null,"abstract":"<div><p>A theory is developed to examine the convergence properties of <span><math><msub><mrow><mi>ℓ</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span>-norm penalized high-dimensional <span><math><mi>M</mi></math></span>-estimators, with nonconvex risk and unrestricted domain. Under high-level conditions, the estimators are shown to attain the rate of convergence <span><math><mrow><msub><mrow><mi>s</mi></mrow><mrow><mn>0</mn></mrow></msub><msqrt><mrow><mo>log</mo><mrow><mo>(</mo><mi>n</mi><mi>d</mi><mo>)</mo></mrow><mo>/</mo><mi>n</mi></mrow></msqrt></mrow></math></span>, where <span><math><msub><mrow><mi>s</mi></mrow><mrow><mn>0</mn></mrow></msub></math></span> is the number of nonzero coefficients of the parameter of interest. Sufficient conditions for our main assumptions are then developed and finally used in several examples including robust linear regression, binary classification and nonlinear least squares.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139986808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonlinear sufficient dimension reduction for distribution-on-distribution regression","authors":"Qi Zhang, Bing Li, Lingzhou Xue","doi":"10.1016/j.jmva.2024.105302","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105302","url":null,"abstract":"<div><p>We introduce a new approach to nonlinear sufficient dimension reduction in cases where both the predictor and the response are distributional data, modeled as members of a metric space. Our key step is to build universal kernels (cc-universal) on the metric spaces, which results in reproducing kernel Hilbert spaces for the predictor and response that are rich enough to characterize the conditional independence that determines sufficient dimension reduction. For univariate distributions, we construct the universal kernel using the Wasserstein distance, while for multivariate distributions, we resort to the sliced Wasserstein distance. The sliced Wasserstein distance ensures that the metric space possesses similar topological properties to the Wasserstein space, while also offering significant computation benefits. Numerical results based on synthetic data show that our method outperforms possible competing methods. The method is also applied to several data sets, including fertility and mortality data and Calgary temperature data.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139986807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linearized maximum rank correlation estimation when covariates are functional","authors":"Wenchao Xu , Xinyu Zhang , Hua Liang","doi":"10.1016/j.jmva.2024.105301","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105301","url":null,"abstract":"<div><p>This paper extends the linearized maximum rank correlation (LMRC) estimation proposed by Shen et al. (2023) to the setting where the covariate is a function. However, this extension is nontrivial due to the difficulty of inverting the covariance operator, which may raise the ill-posed inverse problem, for which we integrate the functional principal component analysis to the LMRC procedure. The proposed estimator is robust to outliers in response and computationally efficient. We establish the rate of convergence of the proposed estimator, which is minimax optimal under certain smoothness assumptions. Furthermore, we extend the proposed estimation procedure to handle discretely observed functional covariates, including both sparse and dense sampling designs, and establish the corresponding rate of convergence. Simulation studies demonstrate that the proposed estimators outperform the other existing methods for some examples. Finally, we apply our method to a real data to illustrate its usefulness.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139975931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variable selection in multivariate regression models with measurement error in covariates","authors":"Jingyu Cui , Grace Y. Yi","doi":"10.1016/j.jmva.2024.105299","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105299","url":null,"abstract":"<div><p>Multivariate regression models have been broadly used in analyzing data having multi-dimensional response variables. The use of such models is, however, impeded by the presence of measurement error and spurious variables. While data with such features are common in applications, there has been little work available concerning these features jointly. In this article, we consider variable selection under multivariate regression models with covariates subject to measurement error. To gain flexibility, we allow the dimensions of the covariate and response variables to be either fixed or diverging as the sample size increases. A new regularized method is proposed to handle both variable selection and measurement error effects for error-contaminated data. Our proposed penalized bias-corrected least squares method offers flexibility in selecting the penalty function from a class of functions with different features. Importantly, our method does not require full distributional assumptions for the associated variables, thereby broadening its applicability. We rigorously establish theoretical results and describe a computationally efficient procedure for the proposed method. 
Numerical studies confirm the satisfactory performance of the proposed method under finite settings, and also demonstrate deleterious effects of ignoring measurement error in inferential procedures.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139998890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Latent model extreme value index estimation","authors":"Joni Virta , Niko Lietzén , Lauri Viitasaari , Pauliina Ilmonen","doi":"10.1016/j.jmva.2024.105300","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105300","url":null,"abstract":"<div><p>We propose a novel strategy for multivariate extreme value index estimation. In applications such as finance, volatility and risk of multivariate time series are often driven by the same underlying factors. To estimate the latent risks, we apply a two-stage procedure. First, a set of independent latent series is estimated using a method of latent variable analysis. Then, univariate risk measures are estimated individually for the latent series. We provide conditions under which the effect of the latent model estimation to the asymptotic behavior of the risk estimators is negligible. Simulations illustrate the theory under both i.i.d. and dependent data, and an application into currency exchange rate data shows that the method is able to discover extreme behavior not found by component-wise analysis of the original series.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0047259X24000071/pdfft?md5=ce435e68e1036f63dbf20ffeb86fc426&pid=1-s2.0-S0047259X24000071-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139743276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of multiple networks with common structures in heterogeneous subgroups","authors":"Xing Qin , Jianhua Hu , Shuangge Ma , Mengyun Wu","doi":"10.1016/j.jmva.2024.105298","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105298","url":null,"abstract":"<div><p>Network estimation has been a critical component of high-dimensional data analysis and can provide an understanding of the underlying complex dependence structures. Among the existing studies, Gaussian graphical models have been highly popular. However, they still have limitations due to the homogeneous distribution assumption and the fact that they are only applicable to small-scale data. For example, cancers have various levels of unknown heterogeneity, and biological networks, which include thousands of molecular components, often differ across subgroups while also sharing some commonalities. In this article, we propose a new joint estimation approach for multiple networks with unknown sample heterogeneity, by decomposing the Gaussian graphical model (GGM) into a collection of sparse regression problems. A reparameterization technique and a composite minimax concave penalty are introduced to effectively accommodate the specific and common information across the networks of multiple subgroups, making the proposed estimator significantly advancing from the existing heterogeneity network analysis based on the regularized likelihood of GGM directly and enjoying scale-invariant, tuning-insensitive, and optimization convexity properties. The proposed analysis can be effectively realized using parallel computing. The estimation and selection consistency properties are rigorously established. The proposed approach allows the theoretical studies to focus on independent network estimation only and has the significant advantage of being both theoretically and computationally applicable to large-scale data. 
Extensive numerical experiments with simulated data and the TCGA breast cancer data demonstrate the prominent performance of the proposed approach in both subgroup and network identifications.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139749242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A data depth based nonparametric test of independence between two random vectors","authors":"Sakineh Dehghan, Mohammad Reza Faridrohani","doi":"10.1016/j.jmva.2024.105297","DOIUrl":"https://doi.org/10.1016/j.jmva.2024.105297","url":null,"abstract":"<div><p>A new family of depth-based test statistics is proposed for testing the hypothesis of independence between two random vectors. In the procedure to derive the asymptotic distribution of the tests under the null hypothesis, we do not require any symmetric assumption of the distribution functions. Furthermore, a conditional distribution-free property of the tests is shown. The asymptotic relative efficiency of the tests is discussed under the class of elliptically symmetric distribution. Asymptotic relative efficiencies along with Monte Carlo results suggest that the performance of the proposed class is comparable to the existing ones, and under some circumstances, it has higher power. Finally, we apply the tests to two real data sets and also discuss the robustness of our tests.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139718894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Convergence analysis of data augmentation algorithms for Bayesian robust multivariate linear regression with incomplete data","authors":"Haoxiang Li, Qian Qin, Galin L. Jones","doi":"10.1016/j.jmva.2024.105296","DOIUrl":"10.1016/j.jmva.2024.105296","url":null,"abstract":"<div><p><span>Gaussian mixtures are commonly used for modeling heavy-tailed error distributions in robust linear regression. Combining the likelihood of a multivariate robust linear regression model with a standard improper prior distribution yields an analytically intractable posterior distribution<span> that can be sampled using a data augmentation algorithm. When the response matrix has missing entries, there are unique challenges to the application and analysis of the convergence properties of the algorithm. Conditions for geometric </span></span>ergodicity<span> are provided when the incomplete data have a “monotone” structure. In the absence of a monotone structure, an intermediate imputation step is necessary for implementing the algorithm. In this case, we provide sufficient conditions for the algorithm to be Harris ergodic. Finally, we show that, when there is a monotone structure and intermediate imputation is unnecessary, intermediate imputation slows the convergence of the underlying Monte Carlo Markov chain, while post hoc imputation does not. 
An R package for the data augmentation algorithm is provided.</span></p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139509069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}