Xihao Li, Han Chen, Margaret Sunitha Selvaraj, Eric Van Buren, Hufeng Zhou, Yuxuan Wang, Ryan Sun, Zachary R. McCaw, Zhi Yu, Min-Zhi Jiang, Daniel DiCorpo, Sheila M. Gaynor, Rounak Dey, Donna K. Arnett, Emelia J. Benjamin, Joshua C. Bis, John Blangero, Eric Boerwinkle, Donald W. Bowden, Jennifer A. Brody, Brian E. Cade, April P. Carson, Jenna C. Carlson, Nathalie Chami, Yii-Der Ida Chen, Joanne E. Curran, Paul S. de Vries, Myriam Fornage, Nora Franceschini, Barry I. Freedman, Charles Gu, Nancy L. Heard-Costa, Jiang He, Lifang Hou, Yi-Jen Hung, Marguerite R. Irvin, Robert C. Kaplan, Sharon L. R. Kardia, Tanika N. Kelly, Iain Konigsberg, Charles Kooperberg, Brian G. Kral, Changwei Li, Yun Li, Honghuang Lin, Ching-Ti Liu, Ruth J. F. Loos, Michael C. Mahaney, Lisa W. Martin, Rasika A. Mathias, Braxton D. Mitchell, May E. Montasser, Alanna C. Morrison, Take Naseri, Kari E. North, Nicholette D. Palmer, Patricia A. Peyser, Bruce M. Psaty, Susan Redline, Alexander P. Reiner, Stephen S. Rich, Colleen M. Sitlani, Jennifer A. Smith, Kent D. Taylor, Hemant K. Tiwari, Ramachandran S. Vasan, Satupa’itea Viali, Zhe Wang, Jennifer Wessel, Lisa R. Yanek, Bing Yu, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Josée Dupuis, James B. Meigs, Paul L. Auer, Laura M. Raffield, Alisa K. Manning, Kenneth M. Rice, Jerome I. Rotter, Gina M. Peloso, Pradeep Natarajan, Zilin Li, Zhonghua Liu, Xihong Lin
Nature Computational Science, volume 5, issue 2, pages 125-143. Journal Article. Published 2025-02-07.
DOI: 10.1038/s43588-024-00764-8
https://www.nature.com/articles/s43588-024-00764-8
A statistical framework for multi-trait rare variant analysis in large-scale whole-genome sequencing studies
Large-scale whole-genome sequencing (WGS) studies have improved our understanding of the contributions of coding and noncoding rare variants to complex human traits. Leveraging association effect sizes across multiple traits in WGS rare variant association analysis can improve statistical power over single-trait analysis, and also detect pleiotropic genes and regions. Existing multi-trait methods have limited ability to perform rare variant analysis of large-scale WGS data. We propose MultiSTAAR, a statistical framework and computationally scalable analytical pipeline for functionally informed multi-trait rare variant analysis in large-scale WGS studies. MultiSTAAR accounts for relatedness, population structure and correlation among phenotypes by jointly analyzing multiple traits, and further empowers rare variant association analysis by incorporating multiple functional annotations. We applied MultiSTAAR to jointly analyze three lipid traits in 61,838 multi-ethnic samples from the Trans-Omics for Precision Medicine (TOPMed) Program. We discovered and replicated new associations with lipid traits missed by single-trait analysis. MultiSTAAR provides a general and flexible statistical framework for functionally informed multi-trait rare variant analysis of biobank-scale sequencing studies by jointly analyzing multiple traits and incorporating annotation information.
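The abstract does not say how test results from multiple functional annotations are merged into a single result. One widely used option for combining possibly correlated p-values is the Cauchy combination rule. The sketch below illustrates that rule only; it is not taken from the MultiSTAAR software, and the function name `cauchy_combine` and its arguments are hypothetical.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule.

    Illustrative helper only (hypothetical name); not the MultiSTAAR API.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")

    total = sum(weights)
    # Map each p-value to a standard Cauchy variate and take the weighted mean.
    stat = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    ) / total
    # Convert the combined statistic back to a p-value via the Cauchy tail.
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # Hypothetical p-values for one variant set under three annotation weightings.
    print(cauchy_combine([0.001, 0.5, 0.5]))
```

With equal weights and identical inputs the rule returns the input p-value unchanged, and a single small p-value among unremarkable ones still pulls the combined value down. That behavior is why the rule suits settings where only some annotations are informative.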