A statistical framework for multi-trait rare variant analysis in large-scale whole-genome sequencing studies
Xihao Li, Han Chen, Margaret Sunitha Selvaraj, Eric Van Buren, Hufeng Zhou, Yuxuan Wang, Ryan Sun, Zachary R. McCaw, Zhi Yu, Min-Zhi Jiang, Daniel DiCorpo, Sheila M. Gaynor, Rounak Dey, Donna K. Arnett, Emelia J. Benjamin, Joshua C. Bis, John Blangero, Eric Boerwinkle, Donald W. Bowden, Jennifer A. Brody, Brian E. Cade, April P. Carson, Jenna C. Carlson, Nathalie Chami, Yii-Der Ida Chen, Joanne E. Curran, Paul S. de Vries, Myriam Fornage, Nora Franceschini, Barry I. Freedman, Charles Gu, Nancy L. Heard-Costa, Jiang He, Lifang Hou, Yi-Jen Hung, Marguerite R. Irvin, Robert C. Kaplan, Sharon L. R. Kardia, Tanika N. Kelly, Iain Konigsberg, Charles Kooperberg, Brian G. Kral, Changwei Li, Yun Li, Honghuang Lin, Ching-Ti Liu, Ruth J. F. Loos, Michael C. Mahaney, Lisa W. Martin, Rasika A. Mathias, Braxton D. Mitchell, May E. Montasser, Alanna C. Morrison, Take Naseri, Kari E. North, Nicholette D. Palmer, Patricia A. Peyser, Bruce M. Psaty, Susan Redline, Alexander P. Reiner, Stephen S. Rich, Colleen M. Sitlani, Jennifer A. Smith, Kent D. Taylor, Hemant K. Tiwari, Ramachandran S. Vasan, Satupa’itea Viali, Zhe Wang, Jennifer Wessel, Lisa R. Yanek, Bing Yu, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Josée Dupuis, James B. Meigs, Paul L. Auer, Laura M. Raffield, Alisa K. Manning, Kenneth M. Rice, Jerome I. Rotter, Gina M. Peloso, Pradeep Natarajan, Zilin Li, Zhonghua Liu, Xihong Lin
Nature Computational Science, volume 5, issue 2, pages 125-143. Published 2025-02-07. DOI: 10.1038/s43588-024-00764-8. https://www.nature.com/articles/s43588-024-00764-8
Abstract
Large-scale whole-genome sequencing (WGS) studies have improved our understanding of the contributions of coding and noncoding rare variants to complex human traits. Leveraging association effect sizes across multiple traits in WGS rare variant association analysis can improve statistical power over single-trait analysis, and also detect pleiotropic genes and regions. Existing multi-trait methods have limited ability to perform rare variant analysis of large-scale WGS data. We propose MultiSTAAR, a statistical framework and computationally scalable analytical pipeline for functionally informed multi-trait rare variant analysis in large-scale WGS studies. MultiSTAAR accounts for relatedness, population structure and correlation among phenotypes by jointly analyzing multiple traits, and further empowers rare variant association analysis by incorporating multiple functional annotations. We applied MultiSTAAR to jointly analyze three lipid traits in 61,838 multi-ethnic samples from the Trans-Omics for Precision Medicine (TOPMed) Program. We discovered and replicated new associations with lipid traits missed by single-trait analysis. MultiSTAAR provides a general and flexible statistical framework for functionally informed multi-trait rare variant analysis of biobank-scale sequencing studies by jointly analyzing multiple traits and incorporating annotation information.