Accurate detection of shared genetic architecture from GWAS summary statistics in the small-sample context

IF 3.7 · Zone 2 (Biology) · JCR Q1 (Agricultural and Biological Sciences)

PLoS Genetics · Pub Date: 2023-07-01 · DOI: 10.1101/2022.10.13.512103

Thomas W. Willis, C. Wallace


Citations: 0

Abstract


Assessment of the genetic similarity between two phenotypes can provide insight into a common genetic aetiology and inform the use of pleiotropy-informed, cross-phenotype analytical methods to identify novel genetic associations. The genetic correlation is a well-known means of quantifying and testing for genetic similarity between traits, but its estimates are subject to comparatively large sampling error. This makes it unsuitable for use in a small-sample context. We discuss the use of a nonparametric test of genetic similarity first introduced by Li et al. for application to GWAS summary statistics. We establish that the null distribution of the test statistic is modelled better by an extreme value distribution than a transformation of the standard exponential distribution as originally recommended by Li and colleagues. We show with simulation studies and real data from GWAS of 18 phenotypes from the UK Biobank that the test is to be preferred for use with small sample sizes, particularly when genetic effects are few and large, outperforming the genetic correlation and another nonparametric statistical test of independence. We find the test suitable for the detection of genetic similarity in the rare disease context.

Author summary

The genome-wide association study (GWAS) is a method used to identify genetic variants which contribute to the risk of developing disease. These genetic variants are frequently shared between conditions, such that the study of the genetic basis of one disease can be informed by knowledge of another, similar disease. This approach can be productive where the disease in question is rare such that a GWAS has less power to associate variants with the disease, but there exist larger GWAS of similar diseases. Existing methods do not measure genetic similarity precisely when patients are few. Here we assess a previously published method of testing for genetic similarity between pairs of diseases using GWAS data, the ‘GPS’ test, against three other methods with the use of real and simulated data. We present a new computational procedure for carrying out the test and show that the GPS test is superior to its comparators in identifying genetic similarity when the sample size is small and when the genetic similarity signal is less strong. Use of the test will enable accurate detection of genetic similarity and the study of rarer conditions using data from better-characterised diseases.
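
The abstract describes the approach only in words, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It assumes the GPS statistic takes the form of a maximum, over variants, of the weighted gap between the bivariate empirical cdf of the two traits' p-values and the product of their marginal empirical cdfs, scaled by sqrt(n / ln n). It also assumes that null draws can be obtained by permuting one trait's p-values across approximately independent variants, and that a generalised extreme value distribution is then fitted to those draws. The names `gps_statistic` and `gps_pvalue` are hypothetical.

```python
import numpy as np
from scipy import stats


def gps_statistic(p1, p2):
    """Assumed GPS form: max weighted gap between joint and product ecdfs."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    n = p1.size
    # Marginal empirical cdfs evaluated at each variant's own p-value.
    f1 = stats.rankdata(p1, method="max") / n
    f2 = stats.rankdata(p2, method="max") / n
    # Bivariate empirical cdf; O(n^2) memory, fine for a small illustration.
    joint = ((p1[None, :] <= p1[:, None]) & (p2[None, :] <= p2[:, None])).mean(axis=1)
    prod = f1 * f2
    denom = np.sqrt(prod - prod**2)
    ok = denom > 0  # skip the corner point where the weight is undefined
    gap = np.abs(joint[ok] - prod[ok]) / denom[ok]
    return np.sqrt(n / np.log(n)) * np.max(gap)


def gps_pvalue(p1, p2, n_perm=200, seed=0):
    """Permutation null (an assumption) summarised by a fitted GEV distribution."""
    rng = np.random.default_rng(seed)
    observed = gps_statistic(p1, p2)
    # Permuting one trait breaks any dependence between the two traits.
    null = np.array([gps_statistic(p1, rng.permutation(p2)) for _ in range(n_perm)])
    shape, loc, scale = stats.genextreme.fit(null)
    # Upper-tail probability of the observed statistic under the fitted GEV.
    return observed, float(stats.genextreme.sf(observed, shape, loc=loc, scale=scale))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    trait_a = rng.uniform(size=300)  # simulated, independent p-values
    trait_b = rng.uniform(size=300)
    print(gps_pvalue(trait_a, trait_b, n_perm=100))
```

Fitting a parametric distribution to the permuted statistics, rather than counting exceedances directly, allows small p-values to be estimated from a modest number of permutations. The quadratic-memory joint ecdf is chosen here for readability and would need a more efficient algorithm at genome-wide scale.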


Source journal

PLoS Genetics (Biology: Genetics)

CiteScore: 8.10

Self-citation rate: 2.20%

Articles published: 438

Review time: 1 month

About the journal: PLOS Genetics is run by an international Editorial Board, headed by the Editors-in-Chief, Greg Barsh (HudsonAlpha Institute of Biotechnology, and Stanford University School of Medicine) and Greg Copenhaver (The University of North Carolina at Chapel Hill). Articles published in PLOS Genetics are archived in PubMed Central and cited in PubMed.