VMGP: A unified variational auto-encoder based multi-task model for multi-phenotype, multi-environment, and cross-population genomic selection in plants

Impact factor: 8.2; JCR quartile: Q1 (Agriculture, Multidisciplinary)
Xiangyu Zhao, Fuzhen Sun, Jinlong Li, Dongfeng Zhang, Qiusi Zhang, Zhongqiang Liu, Changwei Tan, Hongxiang Ma, Kaiyi Wang
Journal: Artificial Intelligence in Agriculture, vol. 15(4), pp. 829-842
Published: 2025-06-24
DOI: 10.1016/j.aiia.2025.06.007
URL: https://www.sciencedirect.com/science/article/pii/S2589721725000704
Citations: 0

Abstract

Plant breeding is a cornerstone of agricultural productivity and food security. Genomic Selection has opened a new phase in breeding by using whole-genome variation for genomic prediction, without requiring prior knowledge of the genes associated with specific traits. However, genomic data are very high-dimensional while phenotyped samples are comparatively few, which often leads to the “curse of dimensionality”: traditional statistical, machine learning, and deep learning methods become prone to overfitting and suboptimal predictive performance. To address this challenge, we introduce a unified Variational auto-encoder based Multi-task Genomic Prediction model (VMGP) that integrates self-supervised genomic compression and reconstruction with multiple prediction tasks. The model has been validated on public datasets for wheat, rice, and maize. It performs well in multi-phenotype and multi-environment genomic prediction and handles the complexities of cross-population genomic selection. Furthermore, by combining VMGP with model interpretability, we can screen for relevant single nucleotide polymorphisms, thereby improving prediction performance and suggesting potentially cost-effective genotyping solutions. With its simplicity, stable predictive performance, and open-source code, VMGP is well suited for broad adoption in plant breeding programs, particularly by breeders who prioritize phenotype prediction but may lack extensive knowledge of deep learning or experience in parameter tuning.
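The abstract describes a model that couples self-supervised compression and reconstruction of genotypes with several prediction tasks. One common way to express such a coupling is a single objective that sums a reconstruction term, a KL regularizer on the latent code, and the prediction errors of the individual tasks. The sketch below illustrates that idea for one sample in plain Python. It is an assumption, not the authors' implementation: the function names, the weights `beta` and `task_weight`, and the choice of mean squared error are all hypothetical.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def kl_to_standard_normal(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over latent dims."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def multitask_vae_loss(
    genotype: Sequence[float],
    reconstruction: Sequence[float],
    mu: Sequence[float],
    log_var: Sequence[float],
    phenotype_preds: Sequence[float],
    phenotype_targets: Sequence[float],
    beta: float = 1.0,         # hypothetical weight on the KL term
    task_weight: float = 1.0,  # hypothetical weight on the prediction tasks
) -> float:
    """Combined loss for one sample: reconstruction + beta * KL + task_weight * task error."""
    recon = mse(reconstruction, genotype)
    kl = kl_to_standard_normal(mu, log_var)
    # One prediction per task (e.g. per phenotype or environment), averaged across tasks.
    task = mse(phenotype_preds, phenotype_targets)
    return recon + beta * kl + task_weight * task


# Example with a 4-marker genotype, a 2-dimensional latent code, and 2 tasks.
loss = multitask_vae_loss(
    genotype=[0.0, 1.0, 2.0, 1.0],
    reconstruction=[1.0, 1.0, 2.0, 1.0],
    mu=[1.0, 0.0],
    log_var=[0.0, 0.0],
    phenotype_preds=[2.0, 2.0],
    phenotype_targets=[1.0, 2.0],
)
print(loss)
```

Sharing one latent code across the reconstruction and prediction terms is what lets the unlabeled genotype structure regularize the small labeled phenotype set; how VMGP actually weights or schedules these terms is not stated in the abstract.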
Source journal: Artificial Intelligence in Agriculture (Engineering: Engineering (miscellaneous))
CiteScore: 21.60
Self-citation rate: 0.00%
Articles published: 18
Review time: 12 weeks