Kernel-Based Measure of Variable Importance for Genetic Association Studies.

IF 1.2 | Zone 4, Mathematics | JCR Q4, MATHEMATICAL & COMPUTATIONAL BIOLOGY

International Journal of Biostatistics Pub Date : 2017-06-17 DOI:10.1515/ijb-2016-0087

Vicente Gallego, M Luz Calle, Ramon Oller

Volume 13, Issue 2. Publication type: Journal Article.

Citations: 0

Abstract

Identifying genetic variants that are associated with disease risk is an important goal of genetic association studies. Standard approaches perform a univariate analysis in which each genetic variant, usually a single nucleotide polymorphism (SNP), is tested for association with disease status. Although many genetic variants have been identified and validated with this univariate approach, for most complex diseases a large part of the genetic component is still unknown, the so-called missing heritability. We propose a kernel-based measure of variable importance (KVI) that quantifies the contribution of a SNP, or a group of SNPs, to the joint genetic effect of a set of genetic variants. KVI can be used to rank individual genetic markers, sets of markers that form blocks of linkage disequilibrium, or sets of genetic variants that lie in a gene or a genetic pathway. We prove that, unlike the univariate analysis, KVI captures the relationship with the other genetic variants in the analysis, even when it is measured for each genetic variable separately. This is especially relevant for detecting genetic interactions. We illustrate the results with data from an Alzheimer's disease study and show through simulations that rankings based on KVI improve on rankings based on two measures of importance provided by Random Forest. A simulation study also shows that KVI has high power for detecting genetic interactions.
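The abstract does not state the KVI formula, so the following is only a minimal sketch of the general idea: measuring how much a joint kernel association score drops when a SNP, or a group of SNPs, is removed. Everything in it is an assumption made for illustration and not the authors' definition: the degree-2 polynomial kernel, the score Q = r'Kr, the drop-based importance, the simulated data, and all function names.

```python
import numpy as np

def quadratic_kernel(X):
    """Degree-2 polynomial kernel; its feature map includes pairwise SNP products."""
    return (1.0 + X @ X.T) ** 2

def kernel_score(X, y, kernel=quadratic_kernel):
    """Joint association score Q = r' K r, with r the centered phenotype."""
    r = y - y.mean()
    return float(r @ kernel(X) @ r)

def drop_importance(X, y, group, kernel=quadratic_kernel):
    """Share of the joint score lost when the SNP columns in `group` are removed.

    Illustrative stand-in only; this is NOT the KVI definition from the paper.
    """
    full = kernel_score(X, y, kernel)
    reduced = kernel_score(np.delete(X, group, axis=1), y, kernel)
    return (full - reduced) / full

# Simulated toy data (hypothetical): 500 subjects, 6 SNPs coded 0/1/2.
rng = np.random.default_rng(0)
n, p = 500, 6
G = rng.binomial(2, 0.5, size=(n, p)).astype(float)
X = G - G.mean(axis=0)  # center each SNP column

# SNPs 0 and 1 affect the phenotype only through their interaction,
# so neither has a marginal (univariate) effect by construction.
y = (G[:, 0] - 1.0) * (G[:, 1] - 1.0) + rng.normal(0.0, 0.5, size=n)

# Importance of each SNP on its own, and of the pair {0, 1} as a group.
importance = np.array([drop_importance(X, y, [j]) for j in range(p)])
pair_importance = drop_importance(X, y, [0, 1])
ranking = np.argsort(-importance)  # most important SNP first

print("ranking:", ranking)
print("importance:", np.round(importance, 3))
print("importance of group {0, 1}:", round(pair_importance, 3))
```

Because the quadratic kernel's feature map contains the product of every SNP pair, removing SNP 0 also removes its interaction term with SNP 1. The single-SNP importance therefore reflects a relationship with another variant, which a purely marginal test of SNP 0 would not see in this toy setup.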


Source journal: International Journal of Biostatistics (MATHEMATICAL & COMPUTATIONAL BIOLOGY-STATISTICS & PROBABILITY)

CiteScore: 2.10

Self-citation rate: 8.30%

Articles published: not listed

Review time: >12 weeks

Journal description: The International Journal of Biostatistics (IJB) seeks to publish new biostatistical models and methods, new statistical theory, and original applications of statistical methods to important practical problems arising in the biological, medical, public health, and agricultural sciences, with an emphasis on semiparametric methods. Given that many publication venues exist within biostatistics, IJB offers a home for research focused on modern methods, often based on machine learning and other data-adaptive methodologies, and aims to provide a unique reading experience by compelling authors to be explicit about the statistical inference problem their paper addresses. The journal is intended to cover the entire range of biostatistics, from theoretical advances to relevant and sensible translations of a practical problem into a statistical framework. Electronic publication also allows data and software code to be appended, which opens the door to reproducible research by letting readers easily replicate the analyses described in a paper. Both original research and review articles will be warmly received, as will articles applying sound statistical methods to practical problems.