Chen Dong, Wang Shu-Jie, Zhao Zhen-Jian, Ji Xiang, Shen Qi, Yu Yang, Cui Sheng-di, Wang Jun-Ge, Chen Zi-Yang, Wang Jin-Yong, Guo Zong-Yi, Wu Ping-Xian, Tang Guo-Qing
{"title":"基于机器学习的猪生长性状基因组预测。","authors":"Chen Dong, Wang Shu-Jie, Zhao Zhen-Jian, Ji Xiang, Shen Qi, Yu Yang, Cui Sheng-di, Wang Jun-Ge, Chen Zi-Yang, Wang Jin-Yong, Guo Zong-Yi, Wu Ping-Xian, Tang Guo-Qing","doi":"10.16288/j.yczz.23-120","DOIUrl":null,"url":null,"abstract":"<p><p>This study aimed to assess and compare the performance of different machine learning models in predicting selected pig growth traits and genomic estimated breeding values (GEBV) using automated machine learning, with the goal of optimizing whole-genome evaluation methods in pig breeding. The research employed genomic information, pedigree matrices, fixed effects, and phenotype data from 9968 pigs across multiple companies to derive four optimal machine learning models: deep learning (DL), random forest (RF), gradient boosting machine (GBM), and extreme gradient boosting (XGB). Through 10-fold cross-validation, predictions were made for GEBV and phenotypes of pigs reaching weight milestones (100 kg and 115 kg) with adjustments for backfat and days to weight. The findings indicated that machine learning models exhibited higher accuracy in predicting GEBV compared to phenotypic traits. Notably, GBM demonstrated superior GEBV prediction accuracy, with values of 0.683, 0.710, 0.866, and 0.871 for B100, B115, D100, and D115, respectively, slightly outperforming other methods. In phenotype prediction, GBM emerged as the best-performing model for pigs with B100, B115, D100, and D115 traits, achieving prediction accuracies of 0.547, followed by DL at 0.547, and then XGB with accuracies of 0.672 and 0.670. In terms of model training time, RF required the most time, while GBM and DL fell in between, and XGB demonstrated the shortest training time. In summary, machine learning models obtained through automated techniques exhibited higher GEBV prediction accuracy compared to phenotypic traits. GBM emerged as the overall top performer in terms of prediction accuracy and training time efficiency, while XGB demonstrated the ability to train accurate prediction models within a short timeframe. RF, on the other hand, had longer training times and insufficient accuracy, rendering it unsuitable for predicting pig growth traits and GEBV.</p>","PeriodicalId":35536,"journal":{"name":"遗传","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Genomic prediction of pig growth traits based on machine learning.\",\"authors\":\"Chen Dong, Wang Shu-Jie, Zhao Zhen-Jian, Ji Xiang, Shen Qi, Yu Yang, Cui Sheng-di, Wang Jun-Ge, Chen Zi-Yang, Wang Jin-Yong, Guo Zong-Yi, Wu Ping-Xian, Tang Guo-Qing\",\"doi\":\"10.16288/j.yczz.23-120\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>This study aimed to assess and compare the performance of different machine learning models in predicting selected pig growth traits and genomic estimated breeding values (GEBV) using automated machine learning, with the goal of optimizing whole-genome evaluation methods in pig breeding. The research employed genomic information, pedigree matrices, fixed effects, and phenotype data from 9968 pigs across multiple companies to derive four optimal machine learning models: deep learning (DL), random forest (RF), gradient boosting machine (GBM), and extreme gradient boosting (XGB). Through 10-fold cross-validation, predictions were made for GEBV and phenotypes of pigs reaching weight milestones (100 kg and 115 kg) with adjustments for backfat and days to weight. The findings indicated that machine learning models exhibited higher accuracy in predicting GEBV compared to phenotypic traits. Notably, GBM demonstrated superior GEBV prediction accuracy, with values of 0.683, 0.710, 0.866, and 0.871 for B100, B115, D100, and D115, respectively, slightly outperforming other methods. In phenotype prediction, GBM emerged as the best-performing model for pigs with B100, B115, D100, and D115 traits, achieving prediction accuracies of 0.547, followed by DL at 0.547, and then XGB with accuracies of 0.672 and 0.670. In terms of model training time, RF required the most time, while GBM and DL fell in between, and XGB demonstrated the shortest training time. In summary, machine learning models obtained through automated techniques exhibited higher GEBV prediction accuracy compared to phenotypic traits. GBM emerged as the overall top performer in terms of prediction accuracy and training time efficiency, while XGB demonstrated the ability to train accurate prediction models within a short timeframe. RF, on the other hand, had longer training times and insufficient accuracy, rendering it unsuitable for predicting pig growth traits and GEBV.</p>\",\"PeriodicalId\":35536,\"journal\":{\"name\":\"遗传\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-10-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"遗传\",\"FirstCategoryId\":\"1091\",\"ListUrlMain\":\"https://doi.org/10.16288/j.yczz.23-120\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Medicine\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"遗传","FirstCategoryId":"1091","ListUrlMain":"https://doi.org/10.16288/j.yczz.23-120","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Medicine","Score":null,"Total":0}
Genomic prediction of pig growth traits based on machine learning.
This study aimed to assess and compare the performance of different machine learning models in predicting selected pig growth traits and genomic estimated breeding values (GEBV) using automated machine learning, with the goal of optimizing whole-genome evaluation methods in pig breeding. The research employed genomic information, pedigree matrices, fixed effects, and phenotype data from 9968 pigs across multiple companies to derive four optimal machine learning models: deep learning (DL), random forest (RF), gradient boosting machine (GBM), and extreme gradient boosting (XGB). Through 10-fold cross-validation, predictions were made for GEBV and phenotypes of pigs reaching weight milestones (100 kg and 115 kg) with adjustments for backfat and days to weight. The findings indicated that machine learning models exhibited higher accuracy in predicting GEBV compared to phenotypic traits. Notably, GBM demonstrated superior GEBV prediction accuracy, with values of 0.683, 0.710, 0.866, and 0.871 for B100, B115, D100, and D115, respectively, slightly outperforming other methods. In phenotype prediction, GBM emerged as the best-performing model for pigs with B100, B115, D100, and D115 traits, achieving prediction accuracies of 0.547, followed by DL at 0.547, and then XGB with accuracies of 0.672 and 0.670. In terms of model training time, RF required the most time, while GBM and DL fell in between, and XGB demonstrated the shortest training time. In summary, machine learning models obtained through automated techniques exhibited higher GEBV prediction accuracy compared to phenotypic traits. GBM emerged as the overall top performer in terms of prediction accuracy and training time efficiency, while XGB demonstrated the ability to train accurate prediction models within a short timeframe. RF, on the other hand, had longer training times and insufficient accuracy, rendering it unsuitable for predicting pig growth traits and GEBV.