Genomic and phenomic prediction for soybean seed yield, protein, and oil.

IF 3.9 · Zone 2 (Biology) · Q1 GENETICS & HEREDITY

Plant Genome Pub Date : 2025-03-01 DOI:10.1002/tpg2.70002

Liza Van der Laan, Kyle Parmley, Mojdeh Saadati, Hernan Torres Pacin, Srikanth Panthulugiri, Soumik Sarkar, Baskar Ganapathysubramanian, Aaron Lorenz, Asheesh K Singh

Plant Genome, volume 18, issue 1, e70002. Journal Article. PMC full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11839941/pdf/

Citations: 0

Abstract

Developments in genomics and phenomics have provided valuable tools for cultivar development. Genomic prediction (GP) has been used in commercial soybean [Glycine max (L.) Merr.] breeding programs to predict grain yield and seed composition traits. Phenomic prediction (PP) is a rapidly developing field with the potential to support selection of genotypes early in the growing season. The objective of this study was to compare the performance of GP and PP for predicting soybean seed yield, protein, and oil. We additionally conducted genome-wide association studies (GWAS) to identify significant single-nucleotide polymorphisms (SNPs) associated with the traits of interest. The GWAS panel of 292 diverse accessions was grown in replicated trials in six environments. Spectral data were collected at two time points during the growing season. A genomic best linear unbiased prediction (GBLUP) model was trained on 269 accessions, while three separate machine learning (ML) models were trained on vegetation indices (VIs) and canopy traits. PP had a higher correlation coefficient than GP for seed yield, while GP had higher correlation coefficients for seed protein and oil contents. VIs with high feature importance were used as covariates in a new GBLUP model, and a new random forest model was trained with selected SNPs included as features. These combined models did not outperform the original GP and PP models. These results show that ML can be used for in-season prediction of specific traits in soybean breeding, and they provide insight into incorporating PP and GP into breeding programs.
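To make the GBLUP step and the correlation-based comparison concrete, here is a minimal sketch on synthetic data. It is not the authors' pipeline: the abstract does not say which software, relationship-matrix formula, or variance-component estimates were used. The sketch assumes a VanRaden-style genomic relationship matrix, a fixed variance ratio `lam` (normally estimated, for example by REML), and toy panel sizes. All function names are hypothetical.

```python
import random

def grm(markers):
    """Genomic relationship matrix (VanRaden-style, assumed) from 0/1/2 allele counts."""
    n, m = len(markers), len(markers[0])
    freqs = [sum(row[j] for row in markers) / (2 * n) for j in range(m)]
    centered = [[row[j] - 2 * freqs[j] for j in range(m)] for row in markers]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    return [[sum(a * b for a, b in zip(zi, zk)) / scale for zk in centered]
            for zi in centered]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def gblup_predict(g, y_train, train, test, lam):
    """Predict values for `test` lines from `train` phenotypes.

    lam is the ratio sigma_e^2 / sigma_g^2; it is fixed here as an assumption.
    """
    mu = sum(y_train) / len(y_train)
    lhs = [[g[i][k] + (lam if i == k else 0.0) for k in train] for i in train]
    alpha = solve(lhs, [y - mu for y in y_train])
    return [mu + sum(g[t][i] * a for i, a in zip(train, alpha)) for t in test]

def pearson(x, y):
    """Pearson correlation between predicted and observed values."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Synthetic toy panel (sizes are arbitrary, not the study's 292 accessions).
rng = random.Random(0)
n_lines, n_snps = 60, 200
allele_freqs = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
markers = [[(rng.random() < p) + (rng.random() < p) for p in allele_freqs]
           for _ in range(n_lines)]
effects = [rng.gauss(0, 1) for _ in range(n_snps)]
phenotypes = [sum(x * e for x, e in zip(row, effects)) + rng.gauss(0, 5)
              for row in markers]

train = list(range(48))            # lines with phenotypes used for training
test = list(range(48, n_lines))    # held-out lines to predict
G = grm(markers)
preds = gblup_predict(G, [phenotypes[i] for i in train], train, test, lam=1.0)
r = pearson(preds, [phenotypes[i] for i in test])
print(f"predictive correlation on held-out lines: {r:.2f}")
```

The same `pearson` function could score any other predictor on the same held-out lines, such as an ML model trained on vegetation indices. This is how a GP-versus-PP comparison by correlation coefficient can be set up in principle.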


Source journal

Plant Genome PLANT SCIENCES-GENETICS & HEREDITY

CiteScore: 6.00

Self-citation rate: 4.80%

Review time: >12 weeks

Journal description: The Plant Genome publishes original research investigating all aspects of plant genomics. Technical breakthroughs reporting improvements in the efficiency and speed of acquiring and interpreting plant genomics data are welcome. The editorial board gives preference to novel reports that use innovative genomic applications that advance our understanding of plant biology that may have applications to crop improvement. The journal also publishes invited review articles and perspectives that offer insight and commentary on recent advances in genomics and their potential for agronomic improvement.