Title: Accurate genomic prediction for grain yield and grain moisture content of maize hybrids using multi-environment data.
Authors: Jingxin Wang, Liwei Liu, Kunhui He, Takele Weldu Gebrewahid, Shang Gao, Qingzhen Tian, Zhanyi Li, Yiqun Song, Yiliang Guo, Yanwei Li, Qinxin Cui, Luyan Zhang, Jiankang Wang, Changling Huang, Liang Li, Tingting Guo, Huihui Li
Journal: Journal of Integrative Plant Biology (impact factor 9.3; JCR Q1, Biochemistry & Molecular Biology)
Publication date: 2025-02-17
Publication type: Journal Article
DOI: 10.1111/jipb.13857 (https://doi.org/10.1111/jipb.13857)
Citation count: 0
Abstract
Incorporating genotype-by-environment (GE) interaction effects into genomic prediction (GP) models with multi-environment climate data can improve selection accuracy to accelerate crop breeding but has received little research attention. Here, we conducted a cross-region GP study of grain moisture content (GMC) and grain yield (GY) in maize hybrids in two major Chinese growing regions using data for 19 climatic factors across 34 environments in 2020 and 2021. Predictions were conducted in 2,126 hybrids generated from 475 maize inbred lines, using 9,355 single nucleotide polymorphism markers for genotyping. Models based on genomic best linear unbiased prediction (GBLUP) incorporating GE interaction effects of 19 climatic factors associated with day length, transpiration, temperature, and radiation (GBLUP-GE19CF) and trained on the whole data set outperformed the traditional GBLUP and BayesB models in predicting GMC and GY under 10-fold cross-validation, achieving prediction accuracies of 0.731 and 0.331, respectively. To refine the climate data, we examined 84 statistical features associated with these climatic factors and identified nine factors most correlated with GMC or GY. Principal component analysis of climate data yielded nine principal components responsible for 97% of the variability in the data. Incorporating these nine factors or principal components into the GBLUP-GE framework with a similarity matrix of environments (GBLUP-GE9CF and GBLUP-GEPCA) provided similar prediction accuracies but could reduce the computational burden. In addition, increasing the number of test set environments in the training set from 8 to 14 increased the prediction accuracy of GBLUP-GE19CF trained with monthly average climate data for 2020-2021. Examining prediction accuracy based on concordance, the proportion of overlapping hybrids between the top 50% of predicted and observed values for GMC and GY, indicated that concordance exceeded 50% for the GBLUP-GE19CF model, confirming the reliability of our predictions. This study can provide practical guidance for optimizing GPs for maize breeding programs in multi-environment selection.
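The concordance measure described in the abstract (the share of hybrids that appear in both the top 50% of predicted values and the top 50% of observed values) can be illustrated with a minimal sketch. This is not the authors' code: the function name `top_fraction_concordance`, the `fraction` and `higher_is_better` parameters, and the tie-handling are assumptions made for illustration, and the abstract does not state which direction of ranking was used for each trait.

```python
import numpy as np


def top_fraction_concordance(predicted, observed, fraction=0.5, higher_is_better=True):
    """Share of the top-ranked hybrids that predicted and observed values agree on.

    Hypothetical helper: `fraction` and `higher_is_better` are illustrative
    parameters, not settings taken from the paper.
    """
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    if predicted.ndim != 1 or predicted.shape != observed.shape:
        raise ValueError("predicted and observed must be 1-D arrays of equal length")

    # Number of hybrids in the selected group (truncated toward zero).
    n_top = int(fraction * predicted.size)
    if n_top < 1:
        raise ValueError("fraction is too small for the number of hybrids")

    # Sort descending when larger values are preferred, ascending otherwise.
    sign = -1.0 if higher_is_better else 1.0
    top_predicted = set(np.argsort(sign * predicted, kind="stable")[:n_top].tolist())
    top_observed = set(np.argsort(sign * observed, kind="stable")[:n_top].tolist())

    # Overlap between the two selected groups, as a proportion of the group size.
    return len(top_predicted & top_observed) / n_top


# Toy example with six hybrids: the top three by prediction are hybrids 0, 1, 2;
# the top three by observation are hybrids 0, 1, 3, so two of three are shared.
toy_concordance = top_fraction_concordance([5, 4, 3, 2, 1, 0], [5, 4, 0, 3, 2, 1])
```

For a top-50% cut, a predictor with no information would be expected to share about half of its selected hybrids with the observed top group by chance, which is one way to read the 50% threshold mentioned above.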
About the journal:
Journal of Integrative Plant Biology is a leading academic journal reporting on the latest discoveries in plant biology. Enjoy the latest news and developments in the field, understand new and improved methods and research tools, and explore basic biological questions through reproducible experimental design, using genetic, biochemical, cell and molecular biological methods, and statistical analyses.