An integrative approach to prioritize candidate causal genes for complex traits in cattle.

IF 3.7 | Zone 2, Biology | JCR Q1, GENETICS & HEREDITY
PLoS Genetics, 21(5): e1011492 | Pub Date: 2025-05-30 | eCollection Date: 2025-05-01 | DOI: 10.1371/journal.pgen.1011492
Mohammad Ghoreishifar, Iona M Macleod, Amanda J Chamberlain, Zhiqian Liu, Thomas J Lopdell, Mathew D Littlejohn, Ruidong Xiang, Jennie E Pryce, Michael E Goddard
Citations: 0

Abstract

An integrative approach to prioritize candidate causal genes for complex traits in cattle.

Genome-wide association studies (GWAS) have identified many quantitative trait loci (QTL) associated with complex traits, predominantly in non-coding regions, posing challenges in pinpointing the causal variants and their target genes. Three types of evidence can help identify the gene through which QTL acts: (1) proximity to the most significant GWAS variant, (2) correlation of gene expression with the trait, and (3) the gene's physiological role in the trait. However, there is still uncertainty about the success of these methods in identifying the correct genes. Here, we test the ability of these methods in a comparatively simple series of traits associated with the concentration of polar lipids in milk.

We conducted single-trait GWAS for ~14 million imputed variants and 56 individual milk polar lipid (PL) phenotypes in 336 cows. A multi-trait meta-analysis of GWAS identified 10,063 significant SNPs at FDR ≤ 10% (P ≤ 7.15E-5). Transcriptome data from blood (~12.5K genes, 143 cows) and mammary tissue (~12.2K genes, 169 cows) were analyzed using the genetic score omics regression (GSOR) method. This method links observed gene expression to genetically predicted phenotypes and was used to find associations between gene expression and 56 PL phenotypes. GSOR identified 2,186 genes in blood and 1,404 in mammary tissue associated with at least one PL phenotype (FDR ≤ 1%).

We partitioned the genome into non-overlapping windows of 100 Kb to test for overlap between GSOR-identified genes and GWAS signals. We found a significant overlap between these two datasets, indicating that GSOR-significant genes were more likely to be located within 100 Kb windows that include GWAS signals than those that do not (P = 0.01; odds ratio = 1.47). These windows included 70 significant genes expressed in mammary tissue and 95 in blood. Compared to all expressed genes in each tissue, these genes were enriched for lipid metabolism gene ontology (GO). That is, seven of the 70 significant mammary transcriptome genes (P < 0.01; odds ratio = 3.98) and five of the 95 significant blood genes (P < 0.10; odds ratio = 2.24) were involved in lipid metabolism GO. The candidate causal genes include DGAT1, ACSM5, SERINC5, ABHD3, CYP2U1, PIGL, ARV1, SMPD5, and NPC2, with some overlap between the two tissues.

The overlap between GWAS, GSOR, and GO analyses suggests that together, these methods are more likely to identify genes mediating QTL, though their power remains limited, as reflected by modest odds ratios. Larger sample sizes would enhance the power of these analyses, but issues like linkage disequilibrium would remain.
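
The window-overlap test described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical test produced the reported P-value and odds ratio, nor how a gene is assigned to a window, so the code below makes two assumptions: each gene is placed by a single coordinate, and the enrichment is assessed with a one-sided Fisher's exact test. The function names are hypothetical, and every gene name, position, and SNP is invented toy data rather than anything from the study.

```python
from math import comb

WINDOW_BP = 100_000  # 100 Kb non-overlapping windows, as in the abstract


def window_id(chrom, pos, size=WINDOW_BP):
    """Map a base-pair position to its window: (chromosome, window index)."""
    return (chrom, pos // size)


def contingency(genes, gsor_hits, gwas_snps, size=WINDOW_BP):
    """Cross-tabulate genes by GSOR significance and GWAS-window membership.

    genes:     dict of gene name -> (chromosome, position)  [assumed layout]
    gsor_hits: set of gene names called significant by GSOR
    gwas_snps: iterable of (chromosome, position) for significant SNPs
    Returns (a, b, c, d) = (hit & in, hit & out, non-hit & in, non-hit & out).
    """
    signal_windows = {window_id(c, p, size) for c, p in gwas_snps}
    a = b = c = d = 0
    for gene, (chrom, pos) in genes.items():
        in_signal = window_id(chrom, pos, size) in signal_windows
        hit = gene in gsor_hits
        if hit and in_signal:
            a += 1
        elif hit:
            b += 1
        elif in_signal:
            c += 1
        else:
            d += 1
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Sample odds ratio of the 2x2 table; undefined when b * c == 0."""
    return (a * d) / (b * c)


def fisher_greater(a, b, c, d):
    """One-sided Fisher exact P-value: P(X >= a) with all margins fixed."""
    n = a + b + c + d
    hits = a + b        # row total: GSOR-significant genes
    in_sig = a + c      # column total: genes inside GWAS-signal windows
    upper = min(hits, in_sig)
    tail = sum(comb(in_sig, k) * comb(n - in_sig, hits - k)
               for k in range(a, upper + 1))
    return tail / comb(n, hits)


# Toy data (invented for illustration only).
toy_genes = {
    "G1": ("14", 50_000), "G2": ("14", 150_000), "G3": ("14", 250_000),
    "G4": ("5", 99_999), "G5": ("5", 100_000), "G6": ("5", 420_000),
}
toy_hits = {"G1", "G4", "G6"}
toy_snps = [("14", 10_000), ("5", 5_000), ("5", 130_000)]

table = contingency(toy_genes, toy_hits, toy_snps)
print(table)                   # (2, 1, 1, 2)
print(odds_ratio(*table))      # 4.0
print(fisher_greater(*table))  # 0.5
```

With only six toy genes the odds ratio of 4.0 is nowhere near significant (P = 0.5), which mirrors the abstract's point that modest sample sizes limit the power of this kind of overlap analysis. In practice a library routine such as `scipy.stats.fisher_exact` would normally replace the hand-written tail sum.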

Source journal
PLoS Genetics (GENETICS & HEREDITY)
Self-citation rate: 2.20%
Articles published: 438
Journal description: PLOS Genetics is run by an international Editorial Board, headed by the Editors-in-Chief, Greg Barsh (HudsonAlpha Institute of Biotechnology, and Stanford University School of Medicine) and Greg Copenhaver (The University of North Carolina at Chapel Hill). Articles published in PLOS Genetics are archived in PubMed Central and cited in PubMed.