Robustness of the linear mixed effects model to error distribution assumptions and the consequences for genome-wide association studies.

Pub Date: 2014-10-01 DOI: 10.1515/sagmb-2013-0066
Nicole M Warrington, Kate Tilling, Laura D Howe, Lavinia Paternoster, Craig E Pennell, Yan Yan Wu, Laurent Briollais
Citations: 15

Abstract

Genome-wide association studies have been successful in uncovering novel genetic variants that are associated with disease status or cross-sectional phenotypic traits. Researchers are beginning to investigate how genes play a role in the development of a trait over time. Linear mixed effects models (LMM) are commonly used to model longitudinal data; however, it is unclear whether failure to meet the model's distributional assumptions will affect the conclusions when conducting a genome-wide association study. In an extensive simulation study, we compare coverage probabilities, bias, type 1 error rates and statistical power when the error of the LMM is either heteroscedastic or has a non-Gaussian distribution. We conclude that the model is robust to misspecification if the same function of age is included in the fixed and random effects. However, type 1 error of the genetic effect over time is inflated, regardless of the model misspecification, if the polynomial function for age in the fixed and random effects differs. In situations where the model will not converge with a high-order polynomial function in the random effects, a reduced function can be used, but a robust standard error needs to be calculated to avoid inflation of the type 1 error. As an illustration, an LMM was applied to longitudinal body mass index (BMI) data over childhood in the ALSPAC cohort; the results emphasised the need for the robust standard error to ensure correct inference of associations of longitudinal BMI with chromosome 16 single nucleotide polymorphisms.
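
The role of the robust standard error can be illustrated with a small simulation. The sketch below is not the authors' procedure: it swaps the LMM for an ordinary least squares working model with no random effects at all (an extreme case of a reduced random-effects structure) and compares the model-based standard error of the gene-by-age term with a cluster-robust (sandwich) standard error grouped by subject. The sample size, centred ages, genotype coding and variance components are all hypothetical values chosen for illustration.

```python
import random

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    a = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [row[n:] for row in a]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def simulate(n_subjects=300, ages=(-2.0, -1.0, 0.0, 1.0, 2.0), seed=1):
    """Longitudinal data with a random intercept and slope and NO true genetic effect.
    All numeric settings are illustrative assumptions, not values from the paper."""
    rng = random.Random(seed)
    rows = []  # each entry: (subject id, design row, outcome)
    for s in range(n_subjects):
        g = rng.choice([0, 1, 2])                      # hypothetical SNP genotype
        b0, b1 = rng.gauss(0, 1.0), rng.gauss(0, 1.0)  # subject-level intercept, slope
        for t in ages:
            y = 16.0 + 0.5 * t + b0 + b1 * t + rng.gauss(0, 0.5)
            rows.append((s, [1.0, t, float(g), g * t], y))
    return rows

def fit(rows):
    """OLS fit returning coefficients, model-based SEs and cluster-robust SEs."""
    X = [r[1] for r in rows]
    y = [r[2] for r in rows]
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    bread = invert(xtx)                                # (X'X)^-1
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    beta = [sum(bread[i][j] * xty[j] for j in range(p)) for i in range(p)]
    resid = [yi - sum(b * xi for b, xi in zip(beta, x)) for x, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - p)
    naive = [(sigma2 * bread[i][i]) ** 0.5 for i in range(p)]
    # Sandwich "meat": sum over subjects of (X_s' e_s)(X_s' e_s)'
    scores = {}
    for (s, x, _), e in zip(rows, resid):
        sc = scores.setdefault(s, [0.0] * p)
        for i in range(p):
            sc[i] += x[i] * e
    meat = [[sum(sc[i] * sc[j] for sc in scores.values()) for j in range(p)]
            for i in range(p)]
    cov = matmul(matmul(bread, meat), bread)
    robust = [cov[i][i] ** 0.5 for i in range(p)]
    return beta, naive, robust

beta, naive, robust = fit(simulate())
print("gene-by-age estimate:", beta[3])
print("model-based SE:", naive[3], "cluster-robust SE:", robust[3])
```

Because the working model ignores the subject-specific slopes, the model-based standard error of the gene-by-age term is too small, which is the mechanism behind an inflated type 1 error; the sandwich estimator accounts for the within-subject correlation left in the residuals. In practice an LMM with a robust variance estimator from an established statistics package would be used instead of this hand-rolled version.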
