Julia Angelini, Gerardo D. L. Cervigni, Marta B. Quaglino
New imputation methodologies for genotype-by-environment data: an extensive study of properties of estimators
Euphytica, published 2024-05-27. DOI: https://doi.org/10.1007/s10681-024-03344-z
Abstract
The site regression (SREG) model is used by plant breeders to analyze multi-environment trials (MET) and to examine the relationships among test environments, genotypes (G), and genotype-by-environment interactions (GE). SREG explores the matrix of G plus GE effects by performing a singular value decomposition on the residual matrix from a one-way ANOVA, which requires complete data. Because missing values are common in MET, we propose two new imputation methods that implement an Expectation-Maximization algorithm to fit the SREG model. To evaluate how these proposed methods and other available imputation methods affect SREG parameter estimation, we conducted two studies, one using simulated data and the other a real dataset. In both studies, we evaluated several measures of whether imputation alters the joint effect of G plus GE, as well as how well the missing data are reproduced. We also included situations not commonly addressed in the literature, such as non-random structures of missing values and a big data situation. The proposed procedures provided estimators with good performance and remained superior in several of the aspects studied, even when the competing imputation methods did not achieve convergence. The new methods therefore enable incomplete MET data to be analyzed effectively with an SREG model.
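The general idea described in the abstract (center by environment, decompose the G plus GE residuals with an SVD, and refill the missing cells until the fit stabilizes) can be illustrated with a generic EM-style SVD imputation. The sketch below is an assumption-laden illustration, not the two methods proposed in the paper, whose details the abstract does not give. The function name `em_sreg_impute`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np


def em_sreg_impute(Y, n_components=2, max_iter=500, tol=1e-8):
    """Generic EM-style SVD imputation for a genotypes x environments table.

    Hypothetical illustration only: missing cells are marked with np.nan
    and are refilled from a rank-`n_components` fit of the
    environment-centered (G + GE) matrix.
    """
    Y = np.asarray(Y, dtype=float)
    missing = np.isnan(Y)
    if not missing.any():
        return Y.copy()

    # Initial guess: fill each missing cell with its environment (column) mean.
    filled = np.where(missing, np.nanmean(Y, axis=0), Y)

    for _ in range(max_iter):
        # Remove the environment main effect; what remains is G + GE.
        env_means = filled.mean(axis=0)
        resid = filled - env_means

        # Low-rank approximation of the G + GE matrix via SVD.
        U, s, Vt = np.linalg.svd(resid, full_matrices=False)
        k = min(n_components, s.size)
        fitted = env_means + (U[:, :k] * s[:k]) @ Vt[:k]

        # Update only the missing cells; observed cells are never changed.
        new_filled = np.where(missing, fitted, Y)
        change = np.sqrt(np.mean((new_filled[missing] - filled[missing]) ** 2))
        filled = new_filled
        if change < tol:
            break

    return filled


# Toy usage: 6 genotypes x 4 environments with two missing yields.
rng = np.random.default_rng(1)
toy = 5.0 + rng.normal(size=(6, 1)) @ rng.normal(size=(1, 4))
toy[0, 2] = np.nan
toy[4, 1] = np.nan
completed = em_sreg_impute(toy, n_components=1)
```

Like any iterative scheme of this kind, the loop may stop at `max_iter` without meeting the tolerance, so the number of components and the stopping rule are choices the user has to check against the data.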
About the journal:
Euphytica is an international journal on theoretical and applied aspects of plant breeding. It publishes critical reviews and papers on the results of original research related to plant breeding.
The integration of modern and traditional plant breeding is a growing field of research using transgenic crop plants and/or marker assisted breeding in combination with traditional breeding tools. The content should cover the interests of researchers directly or indirectly involved in plant breeding, at universities, breeding institutes, seed industries, plant biotech companies and industries using plant raw materials, and promote stability, adaptability and sustainability in agriculture and agro-industries.