Integrating gene expression data via weighted multiple kernel ridge regression improved accuracy of genomic prediction

IF 3.1 · Zone 1, Agricultural and Forestry Sciences · Q1 AGRICULTURE, DAIRY & ANIMAL SCIENCE
Xue Wang, Jingfang Si, Yachun Wang, Lingzhao Fang, Zhe Zhang, Yi Zhang
Journal: Genetics Selection Evolution
Publication date: 2025-09-25
DOI: 10.1186/s12711-025-00997-9 (https://doi.org/10.1186/s12711-025-00997-9)

Abstract

Gene expression profiles hold potentially valuable information for predicting breeding values and phenotypes. However, in practical breeding programs, most individuals in the reference population have only genomic data and lack transcriptomic data. Predicting gene expression from genetic markers and integrating the genetically predicted expression data into genomic prediction may offer a solution. This study extends kernel ridge regression (KRR) to weighted multiple kernel ridge regression (WMKRR), which integrates genomic data and transcriptomic data predicted from genetic markers through a multiple kernel learning (MKL) approach. We compared the predictive ability of WMKRR with that of traditional genomic best linear unbiased prediction (GBLUP) and of combined genomic and transcriptomic best linear unbiased prediction (GTBLUP), both with and without genotype feature selection, in two datasets: (i) simulated data (n = 3305) based on the Cattle Genotype-Tissue Expression (CattleGTEx) dataset and (ii) real dairy cattle data (n = 5515).

WMKRR yielded higher predictive abilities than GBLUP and GTBLUP in both the simulated and the real dairy cattle data. For the simulated data based on CattleGTEx, WMKRR improved predictive ability over GBLUP and GTBLUP by an average of 1.12% and 1.13%, respectively, without feature selection, and by 3.17% and 3.23%, respectively, with feature selection. For the real dairy cattle data, in cross-validation, WMKRR improved over GBLUP and GTBLUP by an average of 5.56% and 7.23%, respectively, without feature selection, and by 5.66% and 6.40%, respectively, with feature selection. In forward validation, WMKRR improved over GBLUP and GTBLUP by an average of 5.68% and 8.41%, respectively, without feature selection, and by 4.66% and 7.06%, respectively, with feature selection.

These results demonstrate that the WMKRR model, which integrates genomic and genetically predicted transcriptomic data, achieves better prediction performance than traditional genomic prediction models. The study shows the potential of enhancing genomic breeding applications with omics data at no additional omics sequencing cost.
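The idea of combining a genomic kernel with a kernel built from genetically predicted expression can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the kernel type, how the kernel weights are chosen, or the ridge penalty, so the linear kernels, the validation-set grid search over the weight `w`, the fixed `lam`, and all function names below are assumptions made for illustration. The toy data are synthetic.

```python
import numpy as np


def linear_kernel(X):
    """Centered linear kernel, scaled so its diagonal averages 1 (assumed form)."""
    Xc = X - X.mean(axis=0)
    K = Xc @ Xc.T
    return K / np.mean(np.diag(K))


def wmkrr_predict(kernels, weights, y, train, test, lam=1.0):
    """Ridge regression on a weighted sum of kernels; returns predictions for test rows."""
    K = sum(w * Km for w, Km in zip(weights, kernels))
    mu = y[train].mean()
    K_train = K[np.ix_(train, train)]
    # Dual coefficients: (K_train + lam * I)^{-1} (y_train - mu)
    alpha = np.linalg.solve(K_train + lam * np.eye(len(train)), y[train] - mu)
    return mu + K[np.ix_(test, train)] @ alpha


def predictive_ability(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


# Synthetic toy data (hypothetical sizes, not the datasets from the study).
rng = np.random.default_rng(0)
n, p, g = 300, 500, 50
M = rng.integers(0, 3, size=(n, p)).astype(float)            # genotypes coded 0/1/2
B = rng.normal(size=(p, g)) * (rng.random((p, g)) < 0.05)    # sparse marker-to-expression effects
T_hat = M @ B                                                # stand-in for predicted expression
y = T_hat @ rng.normal(size=g) + rng.normal(scale=20.0, size=n)

K_g, K_t = linear_kernel(M), linear_kernel(T_hat)
train, valid, test = np.arange(200), np.arange(200, 250), np.arange(250, 300)

# Choose the kernel weight on the validation set (an assumed tuning scheme).
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
scores = [
    predictive_ability(y[valid], wmkrr_predict([K_g, K_t], [w, 1 - w], y, train, valid))
    for w in grid
]
best_w = grid[int(np.argmax(scores))]

fit_rows = np.concatenate([train, valid])
pred = wmkrr_predict([K_g, K_t], [best_w, 1 - best_w], y, fit_rows, test)
print("chosen genomic-kernel weight:", best_w)
print("test predictive ability:", round(predictive_ability(y[test], pred), 3))
```

With weights `[1.0, 0.0]` the function reduces to ordinary single-kernel ridge regression on the genomic kernel, which gives a convenient baseline for checking whether the second kernel adds anything on a given dataset.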
Source journal: Genetics Selection Evolution (Biology: Dairy and Animal Science)
CiteScore: 6.50
Self-citation rate: 9.80%
Articles published: 74
Review time: 1 month
Journal description: Genetics Selection Evolution invites basic, applied and methodological content that will aid the current understanding and the utilization of genetic variability in domestic animal species. Although the focus is on domestic animal species, research on other species is invited if it contributes to the understanding of the use of genetic variability in domestic animals. Genetics Selection Evolution publishes results from all levels of study, from the gene to the quantitative trait, from the individual to the population, the breed or the species. Contributions concerning both the biological approach, from molecular genetics to quantitative genetics, and the mathematical approach, from population genetics to statistics, are welcome. Specific areas of interest include but are not limited to: gene and QTL identification, mapping and characterization, analysis of new phenotypes, high-throughput SNP data analysis, functional genomics, cytogenetics, genetic diversity of populations and breeds, genetic evaluation, applied and experimental selection, genomic selection, selection efficiency, and statistical methodology for the genetic analysis of phenotypes with quantitative and mixed inheritance.