{"title":"Integrating Multiple Data Sources with Interactions in Multi-Omics Using Cooperative Learning","authors":"Matteo D'Alessandro, Theophilus Quachie Asenso, Manuela Zucknick","doi":"arxiv-2409.07125","DOIUrl":null,"url":null,"abstract":"Modeling with multi-omics data presents multiple challenges such as the\nhigh-dimensionality of the problem ($p \\gg n$), the presence of interactions\nbetween features, and the need for integration between multiple data sources.\nWe establish an interaction model that allows for the inclusion of multiple\nsources of data from the integration of two existing methods, pliable lasso and\ncooperative learning. The integrated model is tested both on simulation studies\nand on real multi-omics datasets for predicting labor onset and cancer\ntreatment response. The results show that the model is effective in modeling\nmulti-source data in various scenarios where interactions are present, both in\nterms of prediction performance and selection of relevant variables.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"195 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Methodology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07125","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Modeling multi-omics data presents several challenges, including the
high dimensionality of the problem ($p \gg n$), the presence of interactions
between features, and the need to integrate multiple data sources.
We propose an interaction model that accommodates multiple data sources by
integrating two existing methods, pliable lasso and cooperative learning. The
integrated model is evaluated both in simulation studies and on real
multi-omics datasets for predicting labor onset and cancer treatment response.
The results show that the model is effective for multi-source data in various
scenarios where interactions are present, in terms of both prediction
performance and selection of relevant variables.