利用比利时蓝牛全基因组序列数据和功能注释评估基因组选择模型

IF 3.1 1区农林科学 Q1 AGRICULTURE, DAIRY & ANIMAL SCIENCE

Genetics Selection Evolution Pub Date : 2025-03-04 DOI:10.1186/s12711-025-00955-5

Can Yuan, Alain Gillon, José Luis Gualdrón Duarte, Haruko Takeda, Wouter Coppieters, Michel Georges, Tom Druet

{"title":"利用比利时蓝牛全基因组序列数据和功能注释评估基因组选择模型","authors":"Can Yuan, Alain Gillon, José Luis Gualdrón Duarte, Haruko Takeda, Wouter Coppieters, Michel Georges, Tom Druet","doi":"10.1186/s12711-025-00955-5","DOIUrl":null,"url":null,"abstract":"The availability of large cohorts of whole-genome sequenced individuals, combined with functional annotation, is expected to provide opportunities to improve the accuracy of genomic selection (GS). However, such benefits have not often been observed in initial applications. The reference population for GS in Belgian Blue Cattle (BBC) continues to grow. Combined with the availability of reference panels of sequenced individuals, it provides an opportunity to evaluate GS models using whole genome sequence (WGS) data and functional annotation. Here, we used data from 16,508 cows, with phenotypes for five muscular development traits and imputed at the WGS level, in combination with in silico functional annotation and catalogs of putative regulatory variants obtained from experimental data. We evaluated first GS models using the entire WGS data, with or without functional annotation. At this marker density, we were able to run two approaches, assuming either a highly polygenic architecture (GBLUP) or allowing some variants to have larger effects (BayesRR-RC, a Bayesian mixture model), and observed an increased reliability compared to the official GBLUP model at medium marker density (on average 0.016 and 0.018 for GBLUP and BayesRR-RC, respectively). When functional annotation was used, we observed slightly higher reliabilities with an extension of GBLUP that included multiple polygenic terms (one per functional group), while reliabilities decreased with BayesRR-RC. We then used large subsets of variants selected based on functional information or with a linkage disequilibrium (LD) pruning approach, which allowed us to evaluate two additional approaches, BayesCπ and Bayesian Sparse Linear Mixed Model (BSLMM). Reliabilities were higher for these panels than for the WGS data, with the highest accuracies obtained when markers were selected based on functional information. In our setting, BSLMM systematically achieved higher reliabilities than other methods. GS with large panels of functional variants selected from WGS data allowed a significant increase in reliability compared to the official genomic evaluation approach. However, the benefits of using WGS and functional data remained modest, indicating that there is still room for improvement, for example by further refining the functional annotation in the BBC breed.","PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"34 1","pages":""},"PeriodicalIF":3.1000,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Evaluation of genomic selection models using whole genome sequence data and functional annotation in Belgian Blue cattle\",\"authors\":\"Can Yuan, Alain Gillon, José Luis Gualdrón Duarte, Haruko Takeda, Wouter Coppieters, Michel Georges, Tom Druet\",\"doi\":\"10.1186/s12711-025-00955-5\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The availability of large cohorts of whole-genome sequenced individuals, combined with functional annotation, is expected to provide opportunities to improve the accuracy of genomic selection (GS). However, such benefits have not often been observed in initial applications. The reference population for GS in Belgian Blue Cattle (BBC) continues to grow. Combined with the availability of reference panels of sequenced individuals, it provides an opportunity to evaluate GS models using whole genome sequence (WGS) data and functional annotation. Here, we used data from 16,508 cows, with phenotypes for five muscular development traits and imputed at the WGS level, in combination with in silico functional annotation and catalogs of putative regulatory variants obtained from experimental data. We evaluated first GS models using the entire WGS data, with or without functional annotation. At this marker density, we were able to run two approaches, assuming either a highly polygenic architecture (GBLUP) or allowing some variants to have larger effects (BayesRR-RC, a Bayesian mixture model), and observed an increased reliability compared to the official GBLUP model at medium marker density (on average 0.016 and 0.018 for GBLUP and BayesRR-RC, respectively). When functional annotation was used, we observed slightly higher reliabilities with an extension of GBLUP that included multiple polygenic terms (one per functional group), while reliabilities decreased with BayesRR-RC. We then used large subsets of variants selected based on functional information or with a linkage disequilibrium (LD) pruning approach, which allowed us to evaluate two additional approaches, BayesCπ and Bayesian Sparse Linear Mixed Model (BSLMM). Reliabilities were higher for these panels than for the WGS data, with the highest accuracies obtained when markers were selected based on functional information. In our setting, BSLMM systematically achieved higher reliabilities than other methods. GS with large panels of functional variants selected from WGS data allowed a significant increase in reliability compared to the official genomic evaluation approach. However, the benefits of using WGS and functional data remained modest, indicating that there is still room for improvement, for example by further refining the functional annotation in the BBC breed.\",\"PeriodicalId\":55120,\"journal\":{\"name\":\"Genetics Selection Evolution\",\"volume\":\"34 1\",\"pages\":\"\"},\"PeriodicalIF\":3.1000,\"publicationDate\":\"2025-03-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Genetics Selection Evolution\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1186/s12711-025-00955-5\",\"RegionNum\":1,\"RegionCategory\":\"农林科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AGRICULTURE, DAIRY & ANIMAL SCIENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Genetics Selection Evolution","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s12711-025-00955-5","RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AGRICULTURE, DAIRY & ANIMAL SCIENCE","Score":null,"Total":0}

引用次数: 0

摘要

大量全基因组测序个体的可用性，结合功能注释，有望为提高基因组选择（GS）的准确性提供机会。然而，这些好处在最初的应用中并不经常被观察到。比利时蓝牛（BBC）中GS的参考种群继续增长。结合已测序个体参考面板的可用性，它提供了利用全基因组序列（WGS）数据和功能注释来评估GS模型的机会。在这里，我们使用了16,508头奶牛的数据，这些数据具有5种肌肉发育性状的表型，并在WGS水平上进行了估算，结合了从实验数据中获得的计算机功能注释和推测的调节变异目录。我们使用完整的WGS数据评估了第一个GS模型，有或没有功能注释。在这个标记密度下，我们能够运行两种方法，假设一个高多基因结构（GBLUP）或允许一些变体有更大的影响（BayesRR-RC，贝叶斯混合模型），并观察到与中等标记密度下的官方GBLUP模型相比，可靠性有所提高（GBLUP和BayesRR-RC的平均可靠性分别为0.016和0.018）。当使用功能注释时，我们观察到使用包含多个多基因术语（每个功能组一个）的GBLUP扩展的可靠性略高，而使用BayesRR-RC的可靠性降低。然后，我们使用基于功能信息或链接不平衡（LD）修剪方法选择的变体的大子集，这使我们能够评估另外两种方法，bayescc π和贝叶斯稀疏线性混合模型（BSLMM）。这些面板的可靠性高于WGS数据，根据功能信息选择标记时获得的准确性最高。在我们的设置中，BSLMM系统地获得了比其他方法更高的可靠性。与官方基因组评估方法相比，从WGS数据中选择大量功能变异的GS可以显著提高可靠性。然而，使用WGS和功能数据的好处仍然有限，这表明仍有改进的空间，例如，可以进一步完善BBC品种中的功能注释。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Evaluation of genomic selection models using whole genome sequence data and functional annotation in Belgian Blue cattle

The availability of large cohorts of whole-genome sequenced individuals, combined with functional annotation, is expected to provide opportunities to improve the accuracy of genomic selection (GS). However, such benefits have not often been observed in initial applications. The reference population for GS in Belgian Blue Cattle (BBC) continues to grow. Combined with the availability of reference panels of sequenced individuals, it provides an opportunity to evaluate GS models using whole genome sequence (WGS) data and functional annotation. Here, we used data from 16,508 cows, with phenotypes for five muscular development traits and imputed at the WGS level, in combination with in silico functional annotation and catalogs of putative regulatory variants obtained from experimental data. We evaluated first GS models using the entire WGS data, with or without functional annotation. At this marker density, we were able to run two approaches, assuming either a highly polygenic architecture (GBLUP) or allowing some variants to have larger effects (BayesRR-RC, a Bayesian mixture model), and observed an increased reliability compared to the official GBLUP model at medium marker density (on average 0.016 and 0.018 for GBLUP and BayesRR-RC, respectively). When functional annotation was used, we observed slightly higher reliabilities with an extension of GBLUP that included multiple polygenic terms (one per functional group), while reliabilities decreased with BayesRR-RC. We then used large subsets of variants selected based on functional information or with a linkage disequilibrium (LD) pruning approach, which allowed us to evaluate two additional approaches, BayesCπ and Bayesian Sparse Linear Mixed Model (BSLMM). Reliabilities were higher for these panels than for the WGS data, with the highest accuracies obtained when markers were selected based on functional information. In our setting, BSLMM systematically achieved higher reliabilities than other methods. GS with large panels of functional variants selected from WGS data allowed a significant increase in reliability compared to the official genomic evaluation approach. However, the benefits of using WGS and functional data remained modest, indicating that there is still room for improvement, for example by further refining the functional annotation in the BBC breed.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Genetics Selection Evolution 生物-奶制品与动物科学

CiteScore

6.50

自引率

9.80%

发文量

审稿时长

1 months

期刊介绍： Genetics Selection Evolution invites basic, applied and methodological content that will aid the current understanding and the utilization of genetic variability in domestic animal species. Although the focus is on domestic animal species, research on other species is invited if it contributes to the understanding of the use of genetic variability in domestic animals. Genetics Selection Evolution publishes results from all levels of study, from the gene to the quantitative trait, from the individual to the population, the breed or the species. Contributions concerning both the biological approach, from molecular genetics to quantitative genetics, as well as the mathematical approach, from population genetics to statistics, are welcome. Specific areas of interest include but are not limited to: gene and QTL identification, mapping and characterization, analysis of new phenotypes, high-throughput SNP data analysis, functional genomics, cytogenetics, genetic diversity of populations and breeds, genetic evaluation, applied and experimental selection, genomic selection, selection efficiency, and statistical methodology for the genetic analysis of phenotypes with quantitative and mixed inheritance.