Bogong Liu, Huichao Liu, Junhao Tu, Jian Xiao, Jie Yang, Xi He, Haihan Zhang
An investigation of machine learning methods applied to genomic prediction in yellow-feathered broilers
Poultry Science, Volume 104, Issue 1, Article 104489. Published 2024-11-01. DOI: 10.1016/j.psj.2024.104489
https://www.sciencedirect.com/science/article/pii/S0032579124010678
Abstract
Machine learning (ML) methods have developed rapidly in various theoretical and practical research areas, including the prediction of genomic breeding values for large livestock animals. However, few studies have investigated the application of ML in broiler breeding. In this study, seven ML methods, namely support vector regression (SVR), random forest (RF), gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), kernel ridge regression (KRR), and multilayer perceptron (MLP), were employed to predict the genomic breeding values of laying, growth, and carcass traits in a yellow-feathered broiler breeding population. The results indicated that classic methods, such as GBLUP and Bayesian methods, achieved superior prediction accuracy compared to ML methods in five of the eight traits. For half-eviscerated weight (HEW), ML methods showed an average improvement of 54.4% over GBLUP and Bayesian methods. Among the ML methods, SVR, RF, GBDT, and XGBoost exhibited improvements exceeding 60%, with respective values of 61.3%, 61.0%, 60.4%, and 60.7%; MLP improved by 54.4% and LightGBM by 53.7%, while KRR had the lowest improvement at 29.4%. For eviscerated weight (EW), ML methods also outperformed GBLUP and Bayesian methods. MLP gained the largest improvement at 19.0%, while SVR, RF, GBDT, XGBoost, LightGBM, and KRR improved by 15.0%, 16.5%, 9.5%, 7.0%, 1.6%, and 15.9%, respectively. Compared to default hyperparameters, the average improvement of ML methods with tuned hyperparameters was 34.0%, 32.9%, 27.0%, 19.3%, 26.8%, 13.2%, 18.9%, and 46.3%, respectively. The prediction accuracy of the above algorithms could be further optimized by using a genome-wide association study (GWAS) to select subsets of significant SNPs. This work provides valuable insights into genomic prediction, aiding genetic breeding in broilers.
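The per-method figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary and function names are hypothetical, the percentages are copied from the abstract, and the `relative_improvement` formula is an assumed definition of "improvement", since the abstract does not state how it was computed.

```python
# Improvements over GBLUP/Bayesian methods (percent), as reported in the abstract.
hew_gain = {"SVR": 61.3, "RF": 61.0, "GBDT": 60.4, "XGBoost": 60.7,
            "MLP": 54.4, "LightGBM": 53.7, "KRR": 29.4}
ew_gain = {"MLP": 19.0, "SVR": 15.0, "RF": 16.5, "GBDT": 9.5,
           "XGBoost": 7.0, "LightGBM": 1.6, "KRR": 15.9}


def mean_gain(gains):
    """Unweighted mean of the per-method improvements."""
    return sum(gains.values()) / len(gains)


def relative_improvement(acc_ml, acc_ref):
    """Assumed definition (not confirmed by the abstract): percentage gain
    in prediction accuracy of an ML method relative to a reference method."""
    return 100.0 * (acc_ml - acc_ref) / acc_ref


# The HEW mean reproduces the 54.4% average quoted in the abstract.
print(f"HEW mean gain: {mean_gain(hew_gain):.1f}%")
# The EW mean is derived here from the listed values; the abstract does not report it.
print(f"EW mean gain:  {mean_gain(ew_gain):.1f}%")
print("Best method for EW:", max(ew_gain, key=ew_gain.get))
```

The unweighted mean of the seven HEW values agrees with the reported 54.4% average, which suggests that the reported average is a simple mean across the seven ML methods.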
About the journal:
First self-published in 1921, Poultry Science is an internationally renowned monthly journal, known as the authoritative source for a broad range of poultry information and high-caliber research. The journal plays a pivotal role in the dissemination of preeminent poultry-related knowledge across all disciplines. Since January 2020, Poultry Science has been an Open Access journal with no subscription charges, meaning authors who publish here can make their research immediately, permanently, and freely accessible worldwide while retaining copyright to their work. Papers submitted for publication after October 1, 2019 are published as Open Access papers.
An international journal, Poultry Science publishes original papers, research notes, symposium papers, and reviews of basic science as applied to poultry. This authoritative source of poultry information is consistently ranked by ISI Impact Factor as one of the top 10 agriculture, dairy and animal science journals to deliver high-caliber research. Currently it is the highest-ranked (by Impact Factor and Eigenfactor) journal dedicated to publishing poultry research. Subject areas include breeding, genetics, education, production, management, environment, health, behavior, welfare, immunology, molecular biology, metabolism, nutrition, physiology, reproduction, processing, and products.