Canonical correlation analysis in high dimensions with structured regularization.

IF 1.2 · Zone 4, Mathematics · JCR Q2, Statistics & Probability

Statistical Modelling · Pub Date: 2023-06-01 · DOI: 10.1177/1471082x211041033

Elena Tuzhilina, Leonardo Tozzi, Trevor Hastie

Statistical Modelling, volume 23, issue 3, pages 203-227. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10274416/pdf/nihms-1834734.pdf

Citations: 3


Canonical correlation analysis in high dimensions with structured regularization.

Canonical correlation analysis (CCA) is a technique for measuring the association between two multivariate data matrices. A regularized modification of canonical correlation analysis (RCCA) which imposes an ℓ₂ penalty on the CCA coefficients is widely used in applications with high-dimensional data. One limitation of such regularization is that it ignores any data structure, treating all the features equally, which can be ill-suited for some applications. In this article we introduce several approaches to regularizing CCA that take the underlying data structure into account. In particular, the proposed group regularized canonical correlation analysis (GRCCA) is useful when the variables are correlated in groups. We illustrate some computational strategies to avoid excessive computations with regularized CCA in high dimensions. We demonstrate the application of these methods in our motivating application from neuroscience, as well as in a small simulation example.
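The ℓ₂-penalized variant mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes the standard ridge formulation, in which `lam_x * I` and `lam_y * I` are added to the sample covariance matrices before solving the CCA problem by whitening and a singular value decomposition. The function names `rcca` and `inv_sqrt`, the penalty parameters, and the simulated data are hypothetical and chosen only for illustration. The sketch does not implement GRCCA or the computational shortcuts for high dimensions that the article describes.

```python
import numpy as np


def inv_sqrt(m):
    """Inverse square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(m)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def rcca(x, y, lam_x, lam_y):
    """Ridge-regularized CCA (illustrative sketch, not the authors' code).

    Returns coefficient matrices a, b (one column per canonical pair) and
    the regularized canonical correlations s, in decreasing order.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    # The l2 penalty enters as a ridge term on each covariance matrix.
    cxx = x.T @ x / n + lam_x * np.eye(x.shape[1])
    cyy = y.T @ y / n + lam_y * np.eye(y.shape[1])
    cxy = x.T @ y / n
    wx, wy = inv_sqrt(cxx), inv_sqrt(cyy)
    # SVD of the whitened cross-covariance gives the canonical pairs.
    u, s, vt = np.linalg.svd(wx @ cxy @ wy, full_matrices=False)
    return wx @ u, wy @ vt.T, s


# Simulated data with more features than observations (assumed sizes).
rng = np.random.default_rng(0)
latent = rng.normal(size=(30, 1))
x = latent @ rng.normal(size=(1, 60)) + rng.normal(size=(30, 60))
y = latent @ rng.normal(size=(1, 40)) + rng.normal(size=(30, 40))
a, b, s = rcca(x, y, lam_x=0.1, lam_y=0.1)
print("leading regularized canonical correlation:", s[0])
```

With fewer observations than features the unpenalized covariance matrices are singular, so the ridge terms are what make the inverse square roots well defined here. Because the penalty treats every coefficient the same way, this sketch also shows the limitation the abstract points out: no group or other structure among the features is used.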


Source journal

Statistical Modelling (Mathematics: Statistics & Probability)

CiteScore: 2.20

Self-citation rate: 0.00%

Review time: >12 weeks

Journal description: The primary aim of the journal is to publish original and high-quality articles that recognize statistical modelling as the general framework for the application of statistical ideas. Submissions must reflect important developments, extensions, and applications in statistical modelling. The journal also encourages submissions that describe scientifically interesting, complex or novel statistical modelling aspects from a wide diversity of disciplines, and submissions that embrace the diversity of applied statistical modelling.