Data integration via analysis of subspaces (DIVAS)
Jack Prothero, Meilei Jiang, Jan Hannig, Quoc Tran-Dinh, Andrew Ackerman, J. S. Marron
Journal: TEST (JCR Q2, Statistics & Probability; impact factor 1.2)
Published: 2024-03-14
DOI: https://doi.org/10.1007/s11749-024-00923-z
Citations: 0
Abstract
Modern data collection in many data paradigms, including bioinformatics, often incorporates multiple traits derived from different data types (i.e., platforms). We call this data multi-block, multi-view, or multi-omics data. The emergent field of data integration develops and applies new methods for studying multi-block data and identifying how different data types relate and differ. One major frontier in contemporary data integration research is methodology that can identify partially shared structure between sub-collections of data types. This work presents a new approach: Data Integration Via Analysis of Subspaces (DIVAS). DIVAS combines new insights in angular subspace perturbation theory with recent developments in matrix signal processing and convex–concave optimization into one algorithm for exploring partially shared structure. Based on principal angles between subspaces, DIVAS provides built-in inference on the results of the analysis, and is effective even in high-dimension-low-sample-size (HDLSS) situations.
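The abstract says that DIVAS bases its inference on principal angles between subspaces. As a minimal sketch of that one quantity only (not the DIVAS algorithm, its perturbation bounds, or its optimization step), the following computes principal angles between the column spaces of two matrices with the standard QR-plus-SVD approach. The function name `principal_angles` and the toy matrices are illustrative assumptions, and both inputs are assumed to have full column rank.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces of A and B.

    Illustrative helper, not part of DIVAS. Assumes A and B have full column rank.
    """
    # Orthonormal bases for the two column spaces.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Guard against round-off pushing values slightly outside [-1, 1].
    cosines = np.clip(cosines, -1.0, 1.0)
    return np.arccos(cosines)

# Toy example in R^3: span{e1, e2} versus span{e1, e3}.
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
angles = principal_angles(A, B)
```

In this toy case the two planes share the direction e1, so the smallest principal angle is zero, while the remaining directions e2 and e3 are orthogonal, so the other angle is a right angle. A small angle between subspaces estimated from different data blocks is the kind of evidence that would suggest shared structure.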
About the journal:
TEST is an international journal of Statistics and Probability, sponsored by the Spanish Society of Statistics and Operations Research. English is the official language of the journal.
The emphasis of TEST is placed on papers containing original theoretical contributions of direct or potential value in applications. In this respect, the methodological contents are considered to be crucial for the papers published in TEST, but the practical implications of the methodological aspects are also relevant. Original sound manuscripts on either well-established or emerging areas in the scope of the journal are welcome.
One volume is published annually in four issues. In addition to the regular contributions, each issue of TEST contains an invited paper, with discussions, by an internationally recognized statistician on a current and challenging topic.