Ensemble of Bayesian alphabets via constraint weight optimization strategy improves genomic prediction accuracy.

IF 2.2 | Zone 3 (Biology) | Q3 | GENETICS & HEREDITY

G3: Genes|Genomes|Genetics Pub Date : 2025-07-29 DOI:10.1093/g3journal/jkaf150

Prabina Kumar Meher, Upendra Kumar Pradhan, Mrinmoy Ray, Ajit Gupta, Rajender Parsad, Pushpendra Kumar Gupta



Abstract

This study proposes a weight optimization-based ensemble framework aimed at improving genomic prediction accuracy. The framework combines 8 Bayesian models (BayesA, BayesB, BayesC, BayesBpi, BayesCpi, BayesR, BayesL, and BayesRR), with the weight assigned to each model optimized using a genetic algorithm. The ensemble model, named EnBayes, was evaluated on 18 datasets from 4 crop species and showed improved prediction accuracy compared to the individual Bayesian models. New objective functions were proposed to improve prediction accuracy in terms of both Pearson's correlation coefficient and mean square error. The accuracy of the ensemble model was found to be associated with the number of models considered in the framework: a few more accurate models achieved accuracy similar to that of a larger number of less accurate models. Additionally, over-biased and under-biased models influenced the bias of the ensemble model. The study also explored a meta-learning approach using Bayesian models as base learners and random forest, quantile regression forest, and ridge regression as meta-learners; EnBayes outperformed this approach. When the traditional genomic prediction models GBLUP and rrBLUP and the machine learning models support vector machine, random forest, extreme gradient boosting, and light gradient boosting were included in the ensemble framework in addition to the Bayesian models, the ensemble model achieved higher accuracy than the individual Bayesian, BLUP, and machine learning models. We believe that EnBayes will contribute significantly to ongoing efforts to improve genomic prediction accuracy.
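The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the constraint is that weights are non-negative and sum to 1, and it uses a made-up fitness (Pearson's r minus mean square error), since the abstract does not give the paper's actual objective functions. The function names, the synthetic data, and the toy genetic algorithm (elitism, blend crossover, Gaussian mutation) are all hypothetical.

```python
import numpy as np


def objective(weights, preds, y):
    """Assumed fitness (higher is better): Pearson r minus MSE of the weighted prediction."""
    y_hat = preds @ weights
    r = np.corrcoef(y_hat, y)[0, 1]
    mse = np.mean((y_hat - y) ** 2)
    return r - mse


def normalize(pop):
    """Map candidates onto the assumed constraint set: weights >= 0 and summing to 1."""
    pop = np.clip(pop, 0.0, None) + 1e-12
    return pop / pop.sum(axis=1, keepdims=True)


def ga_weights(preds, y, pop_size=60, generations=200, mutation_sd=0.05, seed=0):
    """Toy genetic algorithm over model weights; preds has shape (n_samples, n_models)."""
    rng = np.random.default_rng(seed)
    n_models = preds.shape[1]
    # Seed the population with each single model (one-hot weights) plus random mixtures.
    pop = np.vstack([np.eye(n_models),
                     normalize(rng.random((pop_size - n_models, n_models)))])
    for _ in range(generations):
        fitness = np.array([objective(w, preds, y) for w in pop])
        elite = pop[np.argsort(fitness)[::-1][: pop_size // 2]]  # keep the better half
        n_children = pop_size - len(elite)
        # Blend crossover between randomly paired elite parents, then Gaussian mutation.
        pa = elite[rng.integers(len(elite), size=n_children)]
        pb = elite[rng.integers(len(elite), size=n_children)]
        alpha = rng.random((n_children, 1))
        children = alpha * pa + (1 - alpha) * pb
        children += rng.normal(0.0, mutation_sd, children.shape)
        pop = np.vstack([elite, normalize(children)])
    fitness = np.array([objective(w, preds, y) for w in pop])
    return pop[np.argmax(fitness)]


# Synthetic stand-in for base-model predictions: 4 hypothetical models with different noise.
rng = np.random.default_rng(1)
signal = rng.normal(size=200)
y = signal + rng.normal(scale=0.5, size=200)
preds = np.column_stack([signal + rng.normal(scale=s, size=200)
                         for s in (0.4, 0.6, 0.8, 1.0)])

w = ga_weights(preds, y)
print("weights:", np.round(w, 3))
print("ensemble fitness:", objective(w, preds, y))
```

Because the single-model weight vectors are in the starting population and the best candidate always survives, the in-sample fitness of the returned weights is never lower than that of any single model. In practice the weights would be fitted on held-out predictions (for example, from cross-validation) so that the ensemble's accuracy is not overstated.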




Source journal

G3: Genes|Genomes|Genetics (GENETICS & HEREDITY)

CiteScore: 5.10

Self-citation rate: 3.80%

Articles published: 305

Review time: 3-8 weeks

About the journal: G3: Genes, Genomes, Genetics provides a forum for the publication of high‐quality foundational research, particularly research that generates useful genetic and genomic information such as genome maps, single gene studies, genome‐wide association and QTL studies, as well as genome reports, mutant screens, and advances in methods and technology. The Editorial Board of G3 believes that rapid dissemination of these data is the necessary foundation for analysis that leads to mechanistic insights. G3, published by the Genetics Society of America, meets the critical and growing need of the genetics community for rapid review and publication of important results in all areas of genetics. G3 offers the opportunity to publish the puzzling finding or to present unpublished results that may not have been submitted for review and publication due to a perceived lack of a potential high-impact finding. G3 has earned the DOAJ Seal, which is a mark of certification for open access journals, awarded by DOAJ to journals that achieve a high level of openness, adhere to Best Practice and high publishing standards.