Using Genomic Context Informed Genotype Data and Within‐model Ancestry Adjustment to Classify Type 2 Diabetes

medRxiv - Genetic and Genomic Medicine. Pub Date: 2024-09-13. DOI: 10.1101/2024.09.12.24313579

Eric J Barnett, Yanli Zhang-James, Jonathan Hess, Stephen J Glatt, Stephen V Faraone

Abstract

Despite high heritability estimates, complex genetic disorders have proven difficult to predict with genetic data. Genomic research has documented polygenic inheritance, cross-disorder genetic correlations, and enrichment of risk by functional genomic annotation, but the vast potential of that combined knowledge has not yet been leveraged to build optimal risk models. Additional methods are likely required to advance genetic risk models of complex genetic disorders toward clinical utility. We developed a framework that uses annotations providing genomic context alongside genotype data as input to convolutional neural networks to predict disorder risk. We validated models in a matched-pairs type 2 diabetes dataset. A neural network using genotype data (AUC: 0.66) and a convolutional neural network using context-informed genotype data (AUC: 0.65) both significantly outperformed polygenic risk score approaches in classifying type 2 diabetes. Adversarial ancestry tasks eliminated the predictability of ancestry without changing model performance.
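To make the idea of "context-informed genotype data" more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each variant carries a genotype dosage (0, 1, or 2 copies of an allele) plus a few per-variant annotation values, that these are stacked as channels, and that a convolutional filter then slides across neighbouring variants. The function names, the two annotation tracks, and the kernel weights are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def encode_context_informed(
    dosages: Sequence[int],
    annotations: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Stack genotype dosages with per-variant annotation tracks.

    Returns a channels x variants matrix: row 0 holds the dosages and
    each following row holds one annotation track (assumed layout).
    """
    n_variants = len(dosages)
    for track in annotations:
        if len(track) != n_variants:
            raise ValueError("each annotation track needs one value per variant")
    matrix = [[float(d) for d in dosages]]
    matrix += [[float(a) for a in track] for track in annotations]
    return matrix


def conv1d_valid(x: List[List[float]], kernel: List[List[float]]) -> List[float]:
    """Apply one channels x width filter along the variant axis.

    No padding, stride 1: the output has n_variants - width + 1 values.
    """
    if len(kernel) != len(x):
        raise ValueError("kernel must have one row per input channel")
    width = len(kernel[0])
    n_variants = len(x[0])
    out = []
    for start in range(n_variants - width + 1):
        total = 0.0
        for c in range(len(x)):
            for k in range(width):
                total += kernel[c][k] * x[c][start + k]
        out.append(total)
    return out


# Hypothetical toy input: four variants for one individual.
dosages = [0, 1, 2, 1]
annotations = [
    [1, 0, 0, 1],          # assumed binary flag, e.g. variant in a coding region
    [0.2, 0.9, 0.1, 0.5],  # assumed continuous score, e.g. conservation
]
x = encode_context_informed(dosages, annotations)

# Hypothetical filter: sums dosages of adjacent variants, gives half weight
# to the binary flag, and ignores the continuous score.
kernel = [[1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
features = conv1d_valid(x, kernel)
print(features)
```

In a trained network the filter weights would be learned rather than fixed, and many filters would run in parallel. The sketch only shows how annotation channels could let a filter weight the same genotype differently depending on its genomic context.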
