A rapid method for combined analysis of common and rare variants at the level of a region, gene, or pathway.

JCR quartile: Q2 (Biochemistry, Genetics and Molecular Biology)

Advances and Applications in Bioinformatics and Chemistry. Pub Date: 2012-01-01. Epub Date: 2012-07-24. DOI: 10.2147/AABC.S33049

David Curtis


Citations: 57

Abstract

Previously described methods for the combined analysis of common and rare variants have disadvantages, such as requiring an arbitrary classification of variants or permutation testing to assess statistical significance. Here we propose a novel method which implements a weighting scheme based on allele frequencies observed in both cases and controls. Because the test is unbiased, scores can be analyzed with a standard t-test. To test its validity, we applied it to data for common, rare, and very rare variants simulated under the null hypothesis. To test its power, we applied it to simulated data in which association was present, including data using the observed allele frequencies of common and rare variants in NOD2 previously reported in cases of Crohn's disease and controls. The method produced results that conformed well to those expected under the null hypothesis. It demonstrated more power to detect association when rare and common variants were analyzed jointly, with power increasing further when rare variants were assigned higher weights. A total of 20,000 analyses of a gene containing 62 variants could be performed in 80 minutes on a laptop. This approach shows promise for the analysis of data currently emerging from genome-wide sequencing studies.
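The general idea described in the abstract (weight each variant by its allele frequency pooled over cases and controls, sum the weighted allele counts into one score per subject, then compare case and control scores with a t-test) can be illustrated with a minimal Python sketch. The abstract does not give the weighting formula, so the quadratic `variant_weight` function, the `rare_weight` parameter, the genotype coding (minor-allele counts 0, 1, or 2), and all function names below are assumptions made for illustration. This is not the author's implementation.

```python
from math import sqrt
from statistics import mean, variance


def pooled_allele_freqs(cases, controls):
    """Allele frequency of each variant, pooled over cases and controls.

    Each subject is a list of minor-allele counts (0, 1, or 2), one per variant.
    Pooling means the weights do not depend on case/control labels.
    """
    subjects = cases + controls
    n_alleles = 2 * len(subjects)
    return [sum(column) / n_alleles for column in zip(*subjects)]


def variant_weight(freq, rare_weight):
    """Hypothetical weight (an assumption, not the paper's formula).

    Equals rare_weight as freq approaches 0 and falls to 1 at freq = 0.5.
    """
    return 1.0 + (rare_weight - 1.0) * (1.0 - 2.0 * freq) ** 2


def subject_scores(subjects, weights):
    """Weighted sum of minor-allele counts for each subject."""
    return [sum(w * g for w, g in zip(weights, subject)) for subject in subjects]


def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    if pooled_var == 0:
        raise ValueError("scores have zero variance; t is undefined")
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


def score_association(cases, controls, rare_weight=10.0):
    """Score every subject with pooled-frequency weights, then t-test the scores."""
    freqs = pooled_allele_freqs(cases, controls)
    weights = [variant_weight(f, rare_weight) for f in freqs]
    return student_t(subject_scores(cases, weights),
                     subject_scores(controls, weights))


if __name__ == "__main__":
    # Toy genotypes: 3 cases and 3 controls, 2 variants each (illustrative only).
    toy_cases = [[0, 1], [1, 1], [1, 2]]
    toy_controls = [[0, 0], [0, 2], [0, 1]]
    t_stat, dof = score_association(toy_cases, toy_controls)
    print(f"t = {t_stat:.3f} on {dof} degrees of freedom")
```

Setting `rare_weight` to 1 gives every variant equal weight, while larger values emphasize rare variants, which loosely mirrors the abstract's comparison of analyses with and without higher weights for rare variants.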


Source journal

Advances and Applications in Bioinformatics and Chemistry Biochemistry, Genetics and Molecular Biology-Biochemistry, Genetics and Molecular Biology (miscellaneous)

CiteScore

6.50

Self-citation rate

0.00%

Articles published

Review time

16 weeks