Transfer Learning in Genome-Wide Association Studies with Knockoffs
Authors: Shuangning Li, Zhimei Ren, Chiara Sabatti, Matteo Sesia
Journal: Sankhya-Series B-Applied and Interdisciplinary Statistics
Publication date: 2022-11-15
DOI: https://doi.org/10.1007/s13571-022-00297-y
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12331138/pdf/
Citation count: 0
Abstract
This paper presents and compares alternative transfer learning methods that can increase the power of conditional testing via knockoffs by leveraging prior information in external data sets collected from different populations or measuring related outcomes. The relevance of this methodology is explored in particular within the context of genome-wide association studies, where it can help address the pressing need for principled ways to suitably account for, and efficiently learn from, the genetic variation associated with diverse ancestries. Finally, we apply these methods to analyze several phenotypes in the UK Biobank data set, demonstrating that transfer learning helps knockoffs discover more associations in the data collected from minority populations, potentially opening the way to the development of more accurate polygenic risk scores.
About the journal:
Sankhya, Series A, publishes original, high-quality research articles in various areas of modern statistics, such as probability, theoretical statistics, mathematical statistics and machine learning. The areas are interpreted in a broad sense. Articles are judged on the basis of their novelty and technical correctness.
Sankhya, Series B, primarily covers applied and interdisciplinary statistics including data sciences. Applied articles should preferably include analysis of original data of broad interest, novel applications of methodology and development of methods and techniques of immediate practical use. Authoritative reviews and comprehensive discussion articles in areas of vigorous current research are also welcome.