{"title":"在遗传关联研究中检测疾病-基因型关联的鲁棒方法:使用精确条件枚举而不是模拟排列或渐近近似计算p值。","authors":"Mette Langaas, Øyvind Bakke","doi":"10.1515/sagmb-2013-0084","DOIUrl":null,"url":null,"abstract":"<p><p>In genetic association studies, detecting disease-genotype association is a primary goal. We study seven robust test statistics for such association when the underlying genetic model is unknown, for data on disease status (case or control) and genotype (three genotypes of a biallelic genetic marker). In such studies, p-values have predominantly been calculated by asymptotic approximations or by simulated permutations. We consider an exact method, conditional enumeration. When the number of simulated permutations tends to infinity, the permutation p-value approaches the conditional enumeration p-value, but calculating the latter is much more efficient than performing simulated permutations. We have studied case-control sample sizes with 500-5000 cases and 500-15,000 controls, and significance levels from 5 × 10(-8) to 0.05, thus our results are applicable to genetic association studies with only a few genetic markers under study, intermediate follow-up studies, and genome-wide association studies. Our main findings are: (i) If all monotone genetic models are of interest, the best performance in the situations under study is achieved for the robust test statistics based on the maximum over a range of Cochran-Armitage trend tests with different scores and for the constrained likelihood ratio test. (ii) For significance levels below 0.05, for the test statistics under study, asymptotic approximations may give a test size up to 20 times the nominal level, and should therefore be used with caution. (iii) Calculating p-values based on exact conditional enumeration is a powerful, valid and computationally feasible approach, and we advocate its use in genetic association studies.</p>","PeriodicalId":0,"journal":{"name":"","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/sagmb-2013-0084","citationCount":"6","resultStr":"{\"title\":\"Robust methods to detect disease-genotype association in genetic association studies: calculate p-values using exact conditional enumeration instead of simulated permutations or asymptotic approximations.\",\"authors\":\"Mette Langaas, Øyvind Bakke\",\"doi\":\"10.1515/sagmb-2013-0084\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>In genetic association studies, detecting disease-genotype association is a primary goal. We study seven robust test statistics for such association when the underlying genetic model is unknown, for data on disease status (case or control) and genotype (three genotypes of a biallelic genetic marker). In such studies, p-values have predominantly been calculated by asymptotic approximations or by simulated permutations. We consider an exact method, conditional enumeration. When the number of simulated permutations tends to infinity, the permutation p-value approaches the conditional enumeration p-value, but calculating the latter is much more efficient than performing simulated permutations. We have studied case-control sample sizes with 500-5000 cases and 500-15,000 controls, and significance levels from 5 × 10(-8) to 0.05, thus our results are applicable to genetic association studies with only a few genetic markers under study, intermediate follow-up studies, and genome-wide association studies. Our main findings are: (i) If all monotone genetic models are of interest, the best performance in the situations under study is achieved for the robust test statistics based on the maximum over a range of Cochran-Armitage trend tests with different scores and for the constrained likelihood ratio test. (ii) For significance levels below 0.05, for the test statistics under study, asymptotic approximations may give a test size up to 20 times the nominal level, and should therefore be used with caution. (iii) Calculating p-values based on exact conditional enumeration is a powerful, valid and computationally feasible approach, and we advocate its use in genetic association studies.</p>\",\"PeriodicalId\":0,\"journal\":{\"name\":\"\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0,\"publicationDate\":\"2014-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1515/sagmb-2013-0084\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.1515/sagmb-2013-0084\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1515/sagmb-2013-0084","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Robust methods to detect disease-genotype association in genetic association studies: calculate p-values using exact conditional enumeration instead of simulated permutations or asymptotic approximations.
In genetic association studies, detecting disease-genotype association is a primary goal. We study seven robust test statistics for such association when the underlying genetic model is unknown, for data on disease status (case or control) and genotype (three genotypes of a biallelic genetic marker). In such studies, p-values have predominantly been calculated by asymptotic approximations or by simulated permutations. We consider an exact method, conditional enumeration. When the number of simulated permutations tends to infinity, the permutation p-value approaches the conditional enumeration p-value, but calculating the latter is much more efficient than performing simulated permutations. We have studied case-control sample sizes with 500-5000 cases and 500-15,000 controls, and significance levels from 5 × 10(-8) to 0.05, thus our results are applicable to genetic association studies with only a few genetic markers under study, intermediate follow-up studies, and genome-wide association studies. Our main findings are: (i) If all monotone genetic models are of interest, the best performance in the situations under study is achieved for the robust test statistics based on the maximum over a range of Cochran-Armitage trend tests with different scores and for the constrained likelihood ratio test. (ii) For significance levels below 0.05, for the test statistics under study, asymptotic approximations may give a test size up to 20 times the nominal level, and should therefore be used with caution. (iii) Calculating p-values based on exact conditional enumeration is a powerful, valid and computationally feasible approach, and we advocate its use in genetic association studies.