Genotype inference from aggregated chromatin accessibility data reveals genetic regulatory mechanisms

Brandon M. Wenz, Yuan He, Nae-Chyun Chen, Joseph K. Pickrell, Jeremiah H Li, Max F. Dudek, Taibo Li, Rebecca Keener, Benjamin F. Voight, Christopher D. Brown, Alexis Battle
bioRxiv - Genomics, published 2024-09-05
DOI: 10.1101/2024.09.04.610850 (https://doi.org/10.1101/2024.09.04.610850)

Abstract

Background
Understanding the genetic causes of variability in chromatin accessibility can shed light on the molecular mechanisms through which genetic variants may affect complex traits. Thousands of ATAC-seq samples have been collected that hold information about chromatin accessibility across diverse cell types and contexts, but most of these are not paired with genetic information and come from many distinct projects and laboratories.

Results
We report here joint genotyping, chromatin accessibility peak calling, and discovery of quantitative trait loci that influence chromatin accessibility (caQTLs), demonstrating that caQTL analysis can be performed at large scale in a diverse sample set without pre-existing genotype information. Using 10,293 profiling samples representing 1,454 unique donor individuals across 653 studies from public databases, we catalog 23,381 caQTLs in total. After joint discovery analysis, we cluster samples based on accessible chromatin profiles to identify context-specific caQTLs. We find that caQTLs are strongly enriched for annotations of gene regulatory elements across diverse cell types and tissues and are often strongly linked with genetic variation associated with changes in expression (eQTLs), indicating that caQTLs can mediate genetic effects on gene expression. We demonstrate sharing of causal variants for chromatin accessibility and diverse complex human traits, enabling a more complete picture of the genetic mechanisms underlying complex human phenotypes.

Conclusions
Our work provides a proof of principle for caQTL calling from previously ungenotyped samples, and represents one of the largest, most diverse caQTL resources currently available, informing mechanisms of genetic regulation of gene expression and contribution to disease.
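To make the idea of a caQTL concrete, the sketch below tests one variant against one accessibility peak by regressing normalized peak accessibility on genotype dosage (0, 1, or 2 copies of the alternate allele). This is an illustration only: the abstract does not state which statistical model the authors used, and the function name `caqtl_association`, the simulated data, and the simple linear model without covariates are all assumptions made here.

```python
import numpy as np

def caqtl_association(dosage, accessibility):
    """Hypothetical helper: simple linear regression of peak accessibility
    on genotype dosage. Returns (effect size per allele, t-statistic)."""
    x = np.asarray(dosage, dtype=float)
    y = np.asarray(accessibility, dtype=float)
    n = x.size
    x_centered = x - x.mean()
    sxx = np.sum(x_centered ** 2)
    if n < 3 or sxx == 0:
        raise ValueError("need at least 3 samples and a variable genotype")
    # Ordinary least squares slope and intercept.
    slope = np.sum(x_centered * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    residuals = y - (intercept + slope * x)
    # Residual variance with n - 2 degrees of freedom (intercept + slope).
    sigma2 = np.sum(residuals ** 2) / (n - 2)
    se = np.sqrt(sigma2 / sxx)
    t_stat = slope / se if se > 0 else float("inf")
    return slope, t_stat

# Simulated example (not real data): 200 donors, true effect of 0.5 per allele.
rng = np.random.default_rng(0)
dosage = rng.integers(0, 3, size=200)
accessibility = 0.5 * dosage + rng.normal(0.0, 1.0, size=200)
slope, t_stat = caqtl_association(dosage, accessibility)
print(f"effect per allele: {slope:.2f}, t = {t_stat:.1f}")
```

A real analysis would typically repeat such a test across many variant-peak pairs, adjust for covariates such as ancestry and batch, and correct for multiple testing; none of those steps are shown here.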