Junjian Wang, Francesco Tiezzi, Yijian Huang, Christian Maltecca, Jicai Jiang
Frontiers in Genetics, vol. 16, article 1618891. Published 2025-06-18. DOI: 10.3389/fgene.2025.1618891 (https://doi.org/10.3389/fgene.2025.1618891). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12213717/pdf/
Benchmarking of feed-forward neural network models for genomic prediction of quantitative traits in pigs.
Artificial neural networks are machine learning models that have been applied to various genomic problems, with the ability to learn non-linear relationships and model high-dimensional data. These capabilities make them promising candidates for genomic prediction, because they could capture intricate relationships between genetic variants and phenotypes. Despite these theoretical advantages, neural networks have performed inconsistently in previous genomic prediction research, and few studies have evaluated their performance and feasibility specifically for pig genomic prediction using large-scale data. We evaluated the predictive performance of feed-forward neural network (FFNN) models implemented in TensorFlow, with architectures ranging from one layer (no hidden layers) to four layers (three hidden layers). These FFNN models were compared with five linear methods: GBLUP, LDAK-BOLT, BayesR, SLEMM-WW, and scikit-learn's ridge regression. The evaluation used data from six quantitative traits: off-test body weight (WT), off-test back fat thickness (BF), off-test loin muscle depth (MS), number of piglets born alive (NBA), number of piglets born dead (NBD), and number of piglets weaned (NW). We also assessed the computational efficiency of FFNN models on both CPU and GPU. The benchmarking employed repeated random subsampling validation with sample sizes ranging from 3,290 individuals for reproductive traits to over 26,000 individuals for production traits, using data from a total of 27,481 genotyped pigs. Hyperband tuning was used to optimize the hyper-parameters and select the best model for each structure. Results showed that FFNN models consistently underperformed linear methods across all architectures tested. The one-layer structure yielded the best predictive accuracy among the FFNN approaches. Of the five linear methods, SLEMM-WW demonstrated the best balance of computational efficiency and predictive ability. GPUs offered significant computational efficiency gains for multi-layer FFNN models compared to CPUs, though FFNN models remained more computationally demanding than most linear methods. In conclusion, FFNN models with up to four layers did not improve genomic predictions compared to routine linear methods for pig quantitative traits.
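The validation scheme named in the abstract, repeated random subsampling with a ridge regression baseline, can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are simulated, and the function names (`ridge_fit`, `repeated_subsampling`, `simulate_data`), the 80/20 split, the penalty `lam`, and the use of Pearson correlation between predicted and observed phenotypes as the accuracy measure are all assumptions made for illustration. Ridge is solved in closed form with NumPy rather than through scikit-learn.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Fit ridge regression in closed form; returns (intercept, weights)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean  # centering leaves the intercept unpenalized
    p = X.shape[1]
    weights = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    return y_mean - x_mean @ weights, weights


def repeated_subsampling(X, y, n_repeats=10, test_frac=0.2, lam=10.0, seed=0):
    """Return one test-set Pearson correlation per random train/test split."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_test = int(round(test_frac * n))
    accuracies = []
    for _ in range(n_repeats):
        order = rng.permutation(n)  # fresh random split on every repeat
        test, train = order[:n_test], order[n_test:]
        intercept, weights = ridge_fit(X[train], y[train], lam)
        predicted = intercept + X[test] @ weights
        accuracies.append(np.corrcoef(predicted, y[test])[0, 1])
    return np.array(accuracies)


def simulate_data(n=300, p=50, seed=1):
    """Simulated 0/1/2 genotypes and an additive trait (hypothetical data)."""
    rng = np.random.default_rng(seed)
    X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
    genetic = X @ rng.normal(size=p)
    # Noise spread matches the genetic spread, so heritability is near 0.5.
    noise = rng.normal(scale=genetic.std(), size=n)
    return X, genetic + noise


X, y = simulate_data()
accs = repeated_subsampling(X, y)
print(f"mean accuracy over {accs.size} splits: {accs.mean():.3f}")
```

Unlike k-fold cross-validation, each repeat draws an independent random split, so an individual may appear in several test sets or in none. Averaging over repeats reduces the influence of any single split on the reported accuracy.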
Frontiers in Genetics
Biochemistry, Genetics and Molecular Biology: Molecular Medicine
CiteScore: 5.50
Self-citation rate: 8.10%
Articles published: 3491
Review time: 14 weeks
About the journal:
Frontiers in Genetics publishes rigorously peer-reviewed research on genes and genomes relating to all the domains of life, from humans to plants to livestock and other model organisms. Led by an outstanding Editorial Board of the world’s leading experts, this multidisciplinary, open-access journal is at the forefront of communicating cutting-edge research to researchers, academics, clinicians, policy makers and the public.
The study of inheritance and the impact of the genome on various biological processes is well documented. However, the majority of discoveries are still to come. A new era is seeing major developments in the function and variability of the genome, the use of genetic and genomic tools and the analysis of the genetic basis of various biological phenomena.