Genome-wide association studies are enriched for interacting genes.

IF 4 3区 生物学 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY
Peter T Nguyen, Simon G Coetzee, Irina Silacheva, Dennis J Hazelett
{"title":"Genome-wide association studies are enriched for interacting genes.","authors":"Peter T Nguyen, Simon G Coetzee, Irina Silacheva, Dennis J Hazelett","doi":"10.1186/s13040-024-00421-w","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>With recent advances in single cell technology, high-throughput methods provide unique insight into disease mechanisms and more importantly, cell type origin. Here, we used multi-omics data to understand how genetic variants from genome-wide association studies influence development of disease. We show in principle how to use genetic algorithms with normal, matching pairs of single-nucleus RNA- and ATAC-seq, genome annotations, and protein-protein interaction data to describe the genes and cell types collectively and their contribution to increased risk.</p><p><strong>Results: </strong>We used genetic algorithms to measure fitness of gene-cell set proposals against a series of objective functions that capture data and annotations. The highest information objective function captured protein-protein interactions. We observed significantly greater fitness scores and subgraph sizes in foreground vs. matching sets of control variants. Furthermore, our model reliably identified known targets and ligand-receptor pairs, consistent with prior studies.</p><p><strong>Conclusions: </strong>Our findings suggested that application of genetic algorithms to association studies can generate a coherent cellular model of risk from a set of susceptibility variants. Further, we showed, using breast cancer as an example, that such variants have a greater number of physical interactions than expected due to chance.</p>","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":"18 1","pages":"3"},"PeriodicalIF":4.0000,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11734473/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biodata Mining","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s13040-024-00421-w","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Background: With recent advances in single cell technology, high-throughput methods provide unique insight into disease mechanisms and more importantly, cell type origin. Here, we used multi-omics data to understand how genetic variants from genome-wide association studies influence development of disease. We show in principle how to use genetic algorithms with normal, matching pairs of single-nucleus RNA- and ATAC-seq, genome annotations, and protein-protein interaction data to describe the genes and cell types collectively and their contribution to increased risk.

Results: We used genetic algorithms to measure fitness of gene-cell set proposals against a series of objective functions that capture data and annotations. The highest information objective function captured protein-protein interactions. We observed significantly greater fitness scores and subgraph sizes in foreground vs. matching sets of control variants. Furthermore, our model reliably identified known targets and ligand-receptor pairs, consistent with prior studies.

Conclusions: Our findings suggested that application of genetic algorithms to association studies can generate a coherent cellular model of risk from a set of susceptibility variants. Further, we showed, using breast cancer as an example, that such variants have a greater number of physical interactions than expected due to chance.

全基因组关联研究丰富了相互作用基因。
背景:随着单细胞技术的最新进展,高通量方法为疾病机制和更重要的细胞类型起源提供了独特的见解。在这里,我们使用多组学数据来了解来自全基因组关联研究的遗传变异如何影响疾病的发展。我们原则上展示了如何使用遗传算法与正常的、匹配的单核RNA-和ATAC-seq对、基因组注释和蛋白质-蛋白质相互作用数据来共同描述基因和细胞类型及其对风险增加的贡献。结果:我们使用遗传算法来测量针对一系列目标函数的基因细胞集建议的适应度,这些目标函数捕获数据和注释。最高信息目标函数捕获蛋白质-蛋白质相互作用。我们观察到前景的适应度得分和子图大小明显高于匹配控制变量集。此外,我们的模型可靠地识别了已知的靶标和配体受体对,与先前的研究一致。结论:我们的研究结果表明,将遗传算法应用于关联研究可以从一组易感性变异中产生连贯的风险细胞模型。此外,我们以乳腺癌为例表明,由于偶然性,这些变异具有比预期更多的物理相互作用。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Biodata Mining
Biodata Mining MATHEMATICAL & COMPUTATIONAL BIOLOGY-
CiteScore
7.90
自引率
0.00%
发文量
28
审稿时长
23 weeks
期刊介绍: BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data. Topical areas include, but are not limited to: -Development, evaluation, and application of novel data mining and machine learning algorithms. -Adaptation, evaluation, and application of traditional data mining and machine learning algorithms. -Open-source software for the application of data mining and machine learning algorithms. -Design, development and integration of databases, software and web services for the storage, management, retrieval, and analysis of data from large scale studies. -Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信