Grid-based Gaussian process models for longitudinal genetic data

IF 0.5 Q4 STATISTICS & PROBABILITY
Wonil Chung
{"title":"Grid-based Gaussian process models for longitudinal genetic data","authors":"Wonil Chung","doi":"10.29220/csam.2022.29.1.745","DOIUrl":null,"url":null,"abstract":"Although various statistical methods have been developed to map time-dependent genetic factors, most identified genetic variants can explain only a small portion of the estimated genetic variation in longitudinal traits. Gene-gene and gene-time / environment interactions are known to be important putative sources of the missing heritability. However, mapping epistatic gene-gene interactions is extremely di ffi cult due to the very large parameter spaces for models containing such interactions. In this paper, we develop a Gaussian process (GP) based nonparametric Bayesian variable selection method for longitudinal data. It maps multiple genetic markers without restricting to pairwise interactions. Rather than modeling each main and interaction term explicitly, the GP model measures the importance of each marker, regardless of whether it is mostly due to a main e ff ect or some interaction e ff ect(s), via an unspecified function. To improve the flexibility of the GP model, we propose a novel grid-based method for the within-subject dependence structure. The proposed method can accurately approximate complex covariance structures. The dimension of the covariance matrix depends only on the number of fixed grid points although each subject may have di ff erent numbers of measurements at di ff erent time points. The deviance information criterion (DIC) and the Bayesian predictive information criterion (BPIC) are proposed for selecting an optimal number of grid points. To e ffi ciently draw posterior samples, we combine a hybrid Monte Carlo method with a partially collapsed Gibbs (PCG) sampler. We apply the proposed GP model to a mouse dataset on age-related body weight.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.5000,"publicationDate":"2022-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Communications for Statistical Applications and Methods","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.29220/csam.2022.29.1.745","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
引用次数: 1

Abstract

Although various statistical methods have been developed to map time-dependent genetic factors, most identified genetic variants can explain only a small portion of the estimated genetic variation in longitudinal traits. Gene-gene and gene-time / environment interactions are known to be important putative sources of the missing heritability. However, mapping epistatic gene-gene interactions is extremely di ffi cult due to the very large parameter spaces for models containing such interactions. In this paper, we develop a Gaussian process (GP) based nonparametric Bayesian variable selection method for longitudinal data. It maps multiple genetic markers without restricting to pairwise interactions. Rather than modeling each main and interaction term explicitly, the GP model measures the importance of each marker, regardless of whether it is mostly due to a main e ff ect or some interaction e ff ect(s), via an unspecified function. To improve the flexibility of the GP model, we propose a novel grid-based method for the within-subject dependence structure. The proposed method can accurately approximate complex covariance structures. The dimension of the covariance matrix depends only on the number of fixed grid points although each subject may have di ff erent numbers of measurements at di ff erent time points. The deviance information criterion (DIC) and the Bayesian predictive information criterion (BPIC) are proposed for selecting an optimal number of grid points. To e ffi ciently draw posterior samples, we combine a hybrid Monte Carlo method with a partially collapsed Gibbs (PCG) sampler. We apply the proposed GP model to a mouse dataset on age-related body weight.
基于网格的纵向遗传数据高斯过程模型
尽管已经开发了各种统计方法来绘制与时间相关的遗传因素,但大多数已确定的遗传变异只能解释纵向性状中估计的遗传变异的一小部分。已知基因-基因和基因-时间/环境相互作用是缺失遗传性的重要推定来源。然而,由于包含这种相互作用的模型的参数空间非常大,映射上位基因-基因相互作用是非常困难的。本文提出了一种基于高斯过程(GP)的非参数贝叶斯变量选择方法。它绘制了多个遗传标记,而不局限于成对的相互作用。GP模型不是显式地对每个主要和交互项进行建模,而是通过未指定的函数测量每个标记的重要性,而不管它主要是由于主要影响还是某些交互影响。为了提高GP模型的灵活性,我们提出了一种基于网格的主题内依赖结构的新方法。该方法可以精确逼近复杂的协方差结构。协方差矩阵的维数仅取决于固定网格点的数量,尽管每个受试者在不同的时间点可能有不同的测量次数。提出了偏差信息准则(DIC)和贝叶斯预测信息准则(BPIC)来选择网格点的最优数量。为了有效地提取后验样本,我们将混合蒙特卡罗方法与部分折叠吉布斯(PCG)采样器相结合。我们将提出的GP模型应用于与年龄相关的体重的小鼠数据集。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
0.90
自引率
0.00%
发文量
49
期刊介绍: Communications for Statistical Applications and Methods (Commun. Stat. Appl. Methods, CSAM) is an official journal of the Korean Statistical Society and Korean International Statistical Society. It is an international and Open Access journal dedicated to publishing peer-reviewed, high quality and innovative statistical research. CSAM publishes articles on applied and methodological research in the areas of statistics and probability. It features rapid publication and broad coverage of statistical applications and methods. It welcomes papers on novel applications of statistical methodology in the areas including medicine (pharmaceutical, biotechnology, medical device), business, management, economics, ecology, education, computing, engineering, operational research, biology, sociology and earth science, but papers from other areas are also considered.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信