Estimating sparse regression models in multi-task learning and transfer learning through adaptive penalisation

Armin Rauschenberger, Petr V Nazarov, Enrico Glaab

Bioinformatics (Oxford, England), published 2025-10-02
DOI: 10.1093/bioinformatics/btaf406
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12502914/pdf/
Citations: 0
Abstract
Method: Here, we propose a simple two-stage procedure for sharing information between related high-dimensional prediction or classification problems. In both stages, we perform sparse regression separately for each problem. The first stage uses no prior information; the second stage uses the coefficients from the first stage as prior information. Specifically, we design feature-specific and sign-specific adaptive weights to share information on feature selection, effect directions, and effect sizes between different problems.
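The two-stage idea can be sketched in a few lines. The following is an illustrative Python example using scikit-learn's Lasso, not the authors' R implementation ("sparselink"): the weighting formula, the regularisation strength, and the simulated data are all assumptions for demonstration, and the sign-specific part of the weights is omitted for brevity. It shows the core mechanism: stage-1 coefficients from the other task are turned into feature-specific penalty weights for stage 2, implemented via the standard column-rescaling trick for the adaptive lasso.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, n_tasks = 100, 50, 2

# Two related tasks sharing the same 5 active features (simulated data)
beta_true = np.zeros(p)
beta_true[:5] = 1.0
X = [rng.normal(size=(n, p)) for _ in range(n_tasks)]
y = [x @ beta_true + rng.normal(scale=0.5, size=n) for x in X]

# Stage 1: independent sparse regression per task, no prior information
stage1 = [Lasso(alpha=0.1).fit(X[k], y[k]).coef_ for k in range(n_tasks)]

# Adaptive weights (hypothetical formula, not the paper's exact scheme):
# a feature's penalty for task k shrinks when the OTHER tasks assign it
# a large coefficient, encouraging shared feature selection
eps = 1e-3
weights = []
for k in range(n_tasks):
    external = sum(np.abs(stage1[j]) for j in range(n_tasks) if j != k)
    weights.append(1.0 / (external + eps))

# Stage 2: adaptive lasso via column rescaling
# (dividing column j by w_j makes the effective penalty w_j * |beta_j|)
stage2 = []
for k in range(n_tasks):
    Xs = X[k] / weights[k]                 # rescale each column by 1/w_j
    fit = Lasso(alpha=0.1).fit(Xs, y[k])
    stage2.append(fit.coef_ / weights[k])  # map back to the original scale
```

Features unsupported by the other task receive a large weight (here about 1/eps), so stage 2 tends to drop them, while features selected across tasks keep a weight near one; this is one way to obtain the sparser second-stage models the abstract describes.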
Results: The proposed approach is applicable to multi-task learning as well as transfer learning. It provides sparse models (i.e. with few non-zero coefficients for each problem) that are easy to interpret. We show by simulation and application that it tends to select fewer features while achieving predictive performance similar to that of existing methods.
Availability and implementation: An implementation is available in the R package "sparselink" (https://github.com/rauschenberger/sparselink, https://cran.r-project.org/package=sparselink).