Using machine learning and partial dependence to evaluate robustness of best linear unbiased prediction (BLUP) for phenotypic values.

IF 2.0 · Zone 3, Biology · JCR Q3, BIOTECHNOLOGY & APPLIED MICROBIOLOGY

Journal of Applied Genetics · Pub Date: 2024-05-01 · Epub Date: 2024-01-03 · DOI: 10.1007/s13353-023-00815-2

Prashant Bhandari, Tong Geon Lee


Citations: 0

Abstract


Using machine learning and partial dependence to evaluate robustness of best linear unbiased prediction (BLUP) for phenotypic values.

Best linear unbiased prediction (BLUP) is widely used in plant research to address experimental variation. For phenotypic values, BLUP accuracy is largely dependent on properly controlled experimental repetition and how variable components are outlined in the model. Thus, determining BLUP robustness implies the need to evaluate contributions from each repetition. Here, we assessed the robustness of BLUP values for simulated or empirical phenotypic datasets, where the BLUP value and each experimental repetition served as dependent and independent (feature) variables, respectively. Our technique incorporated machine learning and partial dependence. First, we compared the feature importance estimated with the neural networks. Second, we compared estimated average marginal effects of individual repetitions, calculated with a partial dependence analysis. We showed that contributions of experimental repetitions are unequal in a phenotypic dataset, suggesting that the calculated BLUP value is likely to be influenced by some repetitions more than others (such as failing to detect simulated true positive associations). To resolve disproportionate sources, variable components in the BLUP model must be further outlined.
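The workflow in the abstract (BLUP value as the response, each experimental repetition as a feature, then feature importance and partial dependence) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation. Several things here are assumptions: the simulated data with a deliberately noisier third repetition, the simplified balanced one-way random-genotype model used to obtain BLUPs, and all function names. The authors fit neural networks; here the exact BLUP map stands in for the fitted model, and permutation importance stands in for their feature-importance estimate.

```python
import random
from statistics import fmean

def simulate(n_geno=200, rep_sd=(1.0, 1.0, 3.0), seed=0):
    """Rows = genotypes, columns = repetitions; the last repetition is noisier (assumed)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_geno):
        g = rng.gauss(0.0, 1.0)  # true genotype effect
        rows.append([10.0 + g + rng.gauss(0.0, sd) for sd in rep_sd])
    return rows

def blup_shrinkage(rows):
    """Grand mean and shrinkage factor h for a balanced one-way random-genotype
    model, using ANOVA (method-of-moments) variance estimates."""
    n, r = len(rows), len(rows[0])
    grand = fmean(v for row in rows for v in row)
    means = [fmean(row) for row in rows]
    ms_g = r * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_e = sum((v - m) ** 2 for row, m in zip(rows, means) for v in row) / (n * (r - 1))
    var_g = max((ms_g - ms_e) / r, 0.0)  # truncate negative estimates at zero
    h = var_g / (var_g + ms_e / r) if var_g > 0 else 0.0
    return grand, h

def make_predictor(grand, h):
    """BLUP of the genotype effect as a function of one genotype's repetitions."""
    return lambda row: h * (fmean(row) - grand)

def partial_dependence(predict, rows, j, grid):
    """Average prediction when repetition j is forced to each grid value."""
    return [fmean(predict(row[:j] + [v] + row[j + 1:]) for row in rows) for v in grid]

def permutation_importance(predict, rows, target, j, seed=0):
    """Increase in mean squared error after shuffling repetition j across genotypes."""
    rng = random.Random(seed)
    col = [row[j] for row in rows]
    rng.shuffle(col)
    shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, col)]
    mse = lambda data: fmean((predict(x) - t) ** 2 for x, t in zip(data, target))
    return mse(shuffled) - mse(rows)

if __name__ == "__main__":
    rows = simulate()
    grand, h = blup_shrinkage(rows)
    predict = make_predictor(grand, h)
    blups = [predict(row) for row in rows]  # response variable
    for j in range(len(rows[0])):
        pd = partial_dependence(predict, rows, j, [8.0, 10.0, 12.0])
        imp = permutation_importance(predict, rows, blups, j)
        print(f"repetition {j}: importance={imp:.4f}, partial dependence={pd}")
```

Both `partial_dependence` and `permutation_importance` only need a `predict` callable, so a fitted neural network (or any other regressor) could be passed in place of the closed-form predictor used here.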


Source journal

Journal of Applied Genetics (Biology: Biotechnology & Applied Microbiology)

CiteScore

4.30

Self-citation rate

4.20%

Articles published

Review time

6-12 weeks

About the journal: The Journal of Applied Genetics is an international journal on genetics and genomics. It publishes peer-reviewed original papers, short communications (including case reports) and review articles focused on the research of applicative aspects of plant, human, animal and microbial genetics and genomics.