Christina M. Rochus, Marije J. Steensma, Marco C. A. M. Bink, Abe E. Huisman, Barbara Harlizius, Martijn F. L. Derks, Richard P. M. A. Crooijmans, Bart J. Ducro, Piter Bijma, Martien A. M. Groenen, Han A. Mulder
{"title":"Estimating mutation rate and characterising single nucleotide de novo mutations in pigs","authors":"Christina M. Rochus, Marije J. Steensma, Marco C. A. M. Bink, Abe E. Huisman, Barbara Harlizius, Martijn F. L. Derks, Richard P. M. A. Crooijmans, Bart J. Ducro, Piter Bijma, Martien A. M. Groenen, Han A. Mulder","doi":"10.1186/s12711-025-00967-1","DOIUrl":"https://doi.org/10.1186/s12711-025-00967-1","url":null,"abstract":"Direct estimates of mutation rates in humans have changed our understanding of evolutionary timing and de novo mutations (DNM) have been associated with several developmental disorders in humans. Livestock species, including pigs, can contribute to the study of DNM because of their ideal population structure and routine phenotype collection. In principle, there is the potential for livestock populations to quickly accumulate new genetic variants because of short generation intervals and high selection intensity. However, the impact of DNM on the fitness of individuals is not known and with current genomic selection programs they cannot contribute to estimated breeding values. The aims of our project were to detect and validate single nucleotide DNM in two commercial pig breeding lines, estimate the single nucleotide mutation rate, and characterise DNM. We sequenced (150 bp paired end reads, 30X coverage) 46 pig trios from two commercial lines. Single nucleotide DNM were detected using a trio-aware method. We defined candidate DNM as single nucleotide variants (SNVs) found in heterozygous state in trio-offspring with both trio-parents homozygous for the reference allele. In this study, we estimate a lower threshold of the DNM rate in pigs of 6.3 × 10⁻⁹ per site per gamete. Our findings are consistent with those from other mammals and those published for a small number of livestock species. Most DNM we detected were in introns (47%) and intergenic regions (49%). 
The mutational spectrum in pigs differs from that in humans and we found several DNM predicted to have an effect on animal’s fitness based on the base pair change and their location in the genome. With this study, we have generated fundamental knowledge on mutation rate in a non-primate species and identified DNM that could have an impact on the fitness of individuals.","PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"6 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143827681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
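The candidate definition and the per-gamete rate in the abstract above can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the function names, the genotype encoding (allele indices with 0 as the reference allele), and the estimator itself, which simply divides the DNM count by twice the number of callable sites because each offspring genome comes from two gametes. The abstract does not give the authors' exact estimator, callable-site counts, or any correction for missed mutations, and the input values below are made up.

```python
from typing import Sequence, Tuple

# Hypothetical encoding: a genotype is a pair of allele indices, 0 = reference.
Genotype = Tuple[int, int]


def is_candidate_dnm(offspring: Genotype, sire: Genotype, dam: Genotype) -> bool:
    """Return True if the offspring is heterozygous (one reference allele,
    one alternative allele) and both parents are homozygous reference."""
    low, high = sorted(offspring)
    offspring_het = low == 0 and high != 0
    parents_hom_ref = sire == (0, 0) and dam == (0, 0)
    return offspring_het and parents_hom_ref


def dnm_rate_per_site_per_gamete(
    dnm_counts: Sequence[int], callable_sites: Sequence[float]
) -> float:
    """Assumed estimator: total DNM / (2 * total callable sites).

    dnm_counts[i] is the number of validated DNM in trio i and
    callable_sites[i] is the number of sites where a DNM could have
    been called in that trio. The factor 2 accounts for the two
    gametes that formed each offspring.
    """
    if len(dnm_counts) != len(callable_sites):
        raise ValueError("need one callable-site count per trio")
    total_sites = sum(callable_sites)
    if total_sites <= 0:
        raise ValueError("callable sites must be positive")
    return sum(dnm_counts) / (2 * total_sites)


# Made-up inputs for two trios; these are not values from the study.
example_rate = dnm_rate_per_site_per_gamete([30, 20], [2.0e9, 2.0e9])
```

Because undetected mutations lower the numerator, an estimate of this form is naturally read as a lower bound, which matches the abstract's wording.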
Setegn Worku Alemu, Thomas J. Lopdell, Alexander J. Trevarton, Russell G. Snell, Mathew D. Littlejohn, Dorian J. Garrick
{"title":"Comparison of genomic prediction accuracies in dairy cattle lactation traits using five classes of functional variants versus generic SNP","authors":"Setegn Worku Alemu, Thomas J. Lopdell, Alexander J. Trevarton, Russell G. Snell, Mathew D. Littlejohn, Dorian J. Garrick","doi":"10.1186/s12711-025-00966-2","DOIUrl":"https://doi.org/10.1186/s12711-025-00966-2","url":null,"abstract":"Genomic selection, typically employing genetic markers from SNP chips, is routine in modern dairy cattle breeding. This study assessed the impact of functional sequence variants on genomic prediction accuracy relative to 50 k SNP chip markers for fat percent, protein percent, milk volume, fat yield, and protein yield in lactating dairy cattle. The functional variants were identified through GWAS, RNA-seq, Histone modification ChIP-seq, ATAC-seq, or were coding variants. The genomic prediction accuracy obtained using each class of functional variants was compared with matched numbers of SNPs randomly selected from the Illumina 50 k SNP chip. The investigation revealed that variants identified by GWAS or RNA-seq, significantly improved the prediction accuracy across all five traits. Contributions from ChIP-seq, ATAC-seq, and coding variants varied. Some variants identified using ChIP-seq showed marked improvements, while others reduced accuracy in protein yield predictions. Relative to a matched number of 32,595 SNPs from the SNP chip, pooling all the functional variants demonstrated prediction accuracy increases of 1.76% for fat percent, 2.97% for protein percent, 0.51% for milk volume, and 0.26% for fat yield, but with a slight decrease of 0.43% in protein yield. The study demonstrates that functional variants can improve prediction accuracy relative to equivalent numbers of variants from a generic SNP panel, with percent traits showing more significant gains than yield traits. 
The main advantage of using functional variants for genomic prediction was achievement of comparable accuracy using a smaller, more selective set of loci. This is particularly evident in trait-specific scenarios. Our findings indicate that specific combinations of functional variants comprising 16 k variants can achieve genomic prediction accuracy comparable to employing a standard panel of twice the size (32.6 k), especially for percent traits. This highlights the potential for the development of more efficient, trait-focused SNP panels utilizing functional variants.","PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"39 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143819556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joseph L. Matt, Jessica Moss Small, Peter D. Kube, Standish K. Allen
{"title":"Quantitative genetic analysis of late spring mortality in triploid Crassostrea virginica","authors":"Joseph L. Matt, Jessica Moss Small, Peter D. Kube, Standish K. Allen","doi":"10.1186/s12711-025-00965-3","DOIUrl":"https://doi.org/10.1186/s12711-025-00965-3","url":null,"abstract":"Triploid oysters, bred by crossing tetraploid and diploid oysters, are common worldwide in commercial oyster aquaculture and make up much of the hatchery-produced Crassostrea virginica farmed in the mid-Atlantic and southeast of the United States. Breeding diploid and tetraploid animals for genetic improvement of triploid progeny is unique to oysters and can proceed via several possible breeding strategies. Triploid oysters, along with their diploid or tetraploid relatives, have not yet been subject to quantitative genetic analyses that could inform a breeding strategy of triploid improvement. The importance of quantitative genetic analyses involving triploid C. virginica has been emphasized by the occurrence of mortality events of near-market sized triploids in late spring. Genetic parameters for survival and weight of triploid and tetraploid C. virginica were estimated from twenty paternal half-sib triploid families and thirty-nine full-sib tetraploid families reared at three sites in the Chesapeake Bay (USA). Traits were analyzed using linear mixed models in ASReml-R. Genetic relationship matrices appropriate for pedigrees with triploid and tetraploid animals were produced using the polyAinv package in R. A mortality event in triploids occurred at one site located on the bayside of the Eastern Shore of Virginia. Between early May and early July, three triploid families had survival of less than 0.70, while most had survival greater than 0.90. The heritability for survival during this period in triploids at this affected site was 0.57 ± 0.23. 
Triploid survival at the affected site was adversely related to triploid survival at the low salinity site (− 0.50 ± 0.23) and unrelated to tetraploid survival at the site with similar salinity (0.05 ± 0.39). Survival during a late spring mortality event in triploids had a substantial additive genetic basis, suggesting selective breeding of tetraploids can reduce triploid mortalities. Genetic correlations revealed evidence of genotype by environment interactions for triploid survival and weak genetic correlations between survival of tetraploids and triploids. A selective breeding strategy with phenotyping of tetraploid and triploid half-sibs is recommended for genetic improvement of triploid oysters.","PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"108 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143813775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of different genotyping and selection strategies in laying hen breeding programs","authors":"Lisa Büttgen, Henner Simianer, Torsten Pook","doi":"10.1186/s12711-025-00948-4","DOIUrl":"https://doi.org/10.1186/s12711-025-00948-4","url":null,"abstract":"Genomic selection has become an integral component of modern animal breeding programs, having the potential to improve the efficiency of layer breeding programs both by obtaining higher prediction accuracies and reducing the generation interval, particularly for males, who cannot be phenotyped for sex-limited traits such as laying performance. In the current study, we investigate different strategies to reduce the generation interval either for both sexes or only for the male side of the breeding scheme based on stochastic simulation using the software MoBPS. Additionally, prediction accuracies based on varying proportions of genotyping and phenotype- and pedigree-based selection as well as genomic breeding values are compared. Selection of hens based on estimated breeding values, either pedigree-based or genomic, increased genetic gain compared to selection based on phenotypes only. The use of two time-shifted subpopulations with exchange of males between subpopulations to reduce the generation interval on the male side led to significantly higher genetic gains. Reducing the generation interval for both males and females was only efficient when population sizes were maintained, which resulted in a doubling of the number of females to genotype and phenotype within the same time frame compared to the scenarios with the longer generation intervals. Although substantially higher gains were obtained, in particular by pedigree-based selection of females and a reduction of generation intervals, this led to substantially greater rates of inbreeding per year. 
The use of a genomic relationship matrix in breeding value estimation instead of a pedigree-based relationship matrix not only increased genetic gains but also reduced inbreeding rates. The use of optimum contribution selection led to basically the same genetic gains as without it but reduced inbreeding rates. However, overall differences obtained with optimal contribution selection were small compared to differences caused by the other effects that were considered. The reduction of the generation interval on the male side by the use of genomic estimated breeding values was highly beneficial. Reduction of the generation interval on the female side was only beneficial when a high proportion of hens was genotyped and housing capacities were increased. On the female side of a layer breeding program, selection based on pedigree-based estimated breeding values was inferior to phenotypic selection, as it resulted in a substantial increase in inbreeding rates.","PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"6 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143790194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Christopher M. Pooley, Glenn Marion, Jamie Prentice, Ricardo Pong-Wong, Stephen C. Bishop, Andrea Doeschl-Wilson
{"title":"SIRE 2.0: a novel method for estimating polygenic host effects underlying infectious disease transmission, and analytical expressions for prediction accuracies","authors":"Christopher M. Pooley, Glenn Marion, Jamie Prentice, Ricardo Pong-Wong, Stephen C. Bishop, Andrea Doeschl-Wilson","doi":"10.1186/s12711-025-00956-4","DOIUrl":"https://doi.org/10.1186/s12711-025-00956-4","url":null,"abstract":"Genetic selection of individuals that are less susceptible to infection, less infectious once infected, and recover faster, offers an effective and long-lasting solution to reduce the incidence and impact of infectious diseases in farmed animals. However, computational methods for simultaneously estimating genetic parameters for host susceptibility, infectivity and recoverability from real-world data have been lacking. Our previously developed methodology and software tool SIRE 1.0 (Susceptibility, Infectivity and Recoverability Estimator) allows estimation of host genetic effects of a single nucleotide polymorphism (SNP), or other fixed effects (e.g. breed, vaccination status), for these three host traits using individual disease data typically available from field studies and challenge experiments. SIRE 1.0, however, lacks the capability to estimate genetic parameters for these traits in the likely case of underlying polygenic control. This paper introduces novel Bayesian methodology and a new software tool SIRE 2.0 for estimating polygenic contributions (i.e. variance components and additive genetic effects) for host susceptibility, infectivity and recoverability from temporal epidemic data, assuming that pedigree or genomic relationships are known. Analytical expressions for prediction accuracies (PAs) for these traits are derived for simplified scenarios, revealing their dependence on genetic and phenotypic variances, and the distribution of related individuals within and between contact groups. 
PAs for infectivity are found to be critically dependent on the size of contact groups. Validation of the methodology with data from simulated epidemics demonstrates good agreement between numerically generated PAs and analytical predictions. Genetic correlations between infectivity and other traits substantially increase trait PAs. Incomplete data (e.g. time censored or infrequent sampling) generally yield only small reductions in PAs, except for when infection times are completely unknown, which results in a substantial reduction. The method presented can estimate genetic parameters for host susceptibility, infectivity and recoverability from individual disease records. The freely available SIRE 2.0 software provides a valuable extension to SIRE 1.0 for estimating host polygenic effects underlying infectious disease transmission. This tool will open up new possibilities for analysis and quantification of genetic determinants of disease dynamics.","PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"34 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143757995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Manu Kumar Gundappa, Diego Robledo, Alastair Hamilton, Ross D. Houston, James G. D. Prendergast, Daniel J. Macqueen
{"title":"High performance imputation of structural and single nucleotide variants using low-coverage whole genome sequencing","authors":"Manu Kumar Gundappa, Diego Robledo, Alastair Hamilton, Ross D. Houston, James G. D. Prendergast, Daniel J. Macqueen","doi":"10.1186/s12711-025-00962-6","DOIUrl":"https://doi.org/10.1186/s12711-025-00962-6","url":null,"abstract":"Whole genome sequencing (WGS), despite its advantages, is yet to replace methods for genotyping single nucleotide variants (SNVs) such as SNP arrays and targeted genotyping assays. Structural variants (SVs) have larger effects on traits than SNVs, but are more challenging to accurately genotype. Using low-coverage WGS with genotype imputation offers a cost-effective strategy to achieve genome-wide variant coverage, but is yet to be tested for SVs. Here, we investigate combined SNV and SV imputation with low-coverage WGS data in Atlantic salmon (Salmo salar). As the reference panel, we used genotypes for high-confidence SVs and SNVs for n = 365 wild individuals sampled from diverse populations. We also generated 15 × WGS data (n = 20 samples) for a commercial population external to the reference panel, and called SVs and SNVs with gold-standard approaches. An imputation method selected for its established performance using low-coverage sequencing data (GLIMPSE) was tested at WGS depths of 1 × , 2 × , 3 × , and 4 × for samples within and external to the reference panel. SNVs were imputed with high accuracy and recall across all WGS depths, including for samples out-with the reference panel. For SVs, we compared imputation based purely on linkage disequilibrium (LD) with SNVs, to that supplemented with SV genotype likelihoods (GLs) from low-coverage WGS. Including SV GLs increased imputation accuracy, but as a trade-off with recall, requiring 3–4 × depth for best performance. Combining strategies allowed us to capture 84% of the reference panel deletions with 87% accuracy at 1 × depth. 
We also show that SV length affects imputation performance, with provision of SV GLs greatly enhancing accuracy for the longest SVs in the dataset. This study highlights the promise of reference panel imputation using low-coverage WGS, including novel opportunities to enhance the resolution of genome-wide association studies by capturing SVs.","PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"57 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143723122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multitrait genome-wide association best linear unbiased prediction of genetic values","authors":"Theo Meuwissen, Vinzent Boerner","doi":"10.1186/s12711-025-00964-4","DOIUrl":"https://doi.org/10.1186/s12711-025-00964-4","url":null,"abstract":"The GWABLUP (Genome-Wide Association based Best Linear Unbiased Prediction) approach used GWA analysis results to differentially weigh the SNPs in genomic prediction, and was found to improve the reliabilities of genomic predictions. However, the proposed multitrait GWABLUP method assumed that the SNP weights were the same across the traits. Here we extended and validated the multitrait GWABLUP method towards using trait specific SNP weights. In a 3-trait dairy data set, multitrait GWAS estimates of SNP effects and their standard errors were translated into trait specific likelihood ratios for the SNPs having trait effects, and posterior probabilities using the GWABLUP approach. This produced trait specific prior (co)variance matrices for each SNP, which were applied in a SNP-BLUP model for genomic predictions, implemented in the APEX linear model suite. In a validation population, the trait specific SNP weights resulted in more reliable predictions for all three traits. Especially, for somatic cell count, which was hardly related to the other traits, the use of the same weights across all traits was harming genomic predictions. The use of trait specific SNP weights overcame this problem. 
In multitrait GWABLUP analyses of ~ 30,000 reference population cows, trait specific SNP weights resulted in up to 13% more reliable genomic predictions than unweighted SNP-BLUP, and improved genomic predictions for all three studied traits.","PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"61 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143665977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Didier Boichard, Sébastien Fritz, Pascal Croiseau, Vincent Ducrocq, Thierry Tribout, Beatriz C. D. Cuyabano
{"title":"Erosion of estimated genomic breeding values with generations is due to long distance associations between markers and QTL","authors":"Didier Boichard, Sébastien Fritz, Pascal Croiseau, Vincent Ducrocq, Thierry Tribout, Beatriz C. D. Cuyabano","doi":"10.1186/s12711-025-00963-5","DOIUrl":"https://doi.org/10.1186/s12711-025-00963-5","url":null,"abstract":"Most validation studies of genomic evaluations on candidates (prior to observing phenotypes) present inflation of their predicted breeding values, i.e., regression coefficients of their later observed phenotypes on the early predictions are smaller than one. The aim of this study was to show that this inflation pattern reflects at least partly long-distance associations between markers and quantitative trait loci (QTL) in the reference population and to propose methods to estimate the corresponding “erosion” coefficient. Across-chromosome linkage disequilibrium (LD) is observed in different dairy cattle breeds, being a result from limited effective population size and from relationships within the reference population. Due to this long distance LD, the estimated SNP effects capture non-zero contributions from distant QTLs, some located on other chromosomes than the SNP itself. Therefore, corresponding SNP effects are partly lost in the next generations and we refer to this loss as “erosion”. With the concept of QTL contribution to SNP effects derived from mixed model equations, we show with simulation that this long range LD explains 6–25% of the variance of the estimated genomic breeding values, a proportion that is unchanged when the evaluation model includes a residual polygenic effect. Two methods are proposed to predict this erosion factor assuming known simulated QTL effects. 
In Method 1, one generation of progeny is simulated from the reference population and the GEBV of these progeny based on SNP effects estimated in this newly simulated generation are regressed on the GEBV of the same progeny based on SNP effects estimated in the reference population. In Method 2 all the QTL contributions to SNP effects are regressed based on SNP-QTL recombination rates and summed to predict the GEBV at the next generation. The regression coefficient of the GEBV based on eroded contributions on the raw GEBV is also an estimate of erosion. An illustration is given with the French Normande female reference bovine population in 2021, showing erosion factors ranging from 0.84 to 0.87. Accounting for erosion is important to avoid inflation and biased predictions. The ways to both reduce inflation and to correct for it in the prediction are discussed.","PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"1 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143666295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
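The inflation check that opens the abstract above, regressing later observed phenotypes on early predictions and comparing the slope with one, can be sketched in a few lines of Python. This is a generic ordinary least squares slope, not the authors' Method 1 or Method 2; the function name and the toy data are assumptions made for illustration only.

```python
from typing import Sequence


def regression_slope(predictions: Sequence[float], phenotypes: Sequence[float]) -> float:
    """Ordinary least squares slope of phenotypes on predictions.

    A slope below one means the early predictions were more spread out
    than the later observations justify, i.e. they were inflated.
    """
    n = len(predictions)
    if n != len(phenotypes) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mean_x = sum(predictions) / n
    mean_y = sum(phenotypes) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictions, phenotypes))
    sxx = sum((x - mean_x) ** 2 for x in predictions)
    if sxx == 0:
        raise ValueError("predictions have no variance")
    return sxy / sxx


# Toy data, not from the study: phenotypes vary only 0.8 units per unit of prediction.
toy_predictions = [1.0, 2.0, 3.0, 4.0]
toy_phenotypes = [0.8, 1.6, 2.4, 3.2]
toy_slope = regression_slope(toy_predictions, toy_phenotypes)
```

In this toy case the slope is below one, which is the pattern the abstract calls inflation. The erosion factors it reports (0.84 to 0.87) are likewise regression coefficients, but they come from the authors' own procedures.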
{"title":"Molecular breeding of pigs in the genome editing era","authors":"Jiahuan Chen, Jiaqi Wang, Haoran Zhao, Xiao Tan, Shihan Yan, Huanyu Zhang, Tiefeng Wang, Xiaochun Tang","doi":"10.1186/s12711-025-00961-7","DOIUrl":"https://doi.org/10.1186/s12711-025-00961-7","url":null,"abstract":"To address the increasing demand for high-quality pork protein, it is essential to implement strategies that enhance diets and produce pigs with excellent production traits. Selective breeding and crossbreeding are the primary methods used for genetic improvement in modern agriculture. However, these methods face challenges due to long breeding cycles and the necessity for beneficial genetic variation associated with high-quality traits within the population. This limitation restricts the transfer of desirable alleles across different genera and species. This article systematically reviews past and current research advancements in porcine molecular breeding. It discusses the screening of clustered regularly interspaced short palindromic repeats (CRISPR) to identify resistance loci in swine and the challenges and future applications of genetically modified pigs. The emergence of transgenic and gene editing technologies has prompted researchers to apply these methods to pig breeding. These advancements allow for alterations in the pig genome through various techniques, ranging from random integration into the genome to site-specific insertion and from target gene knockout (KO) to precise base and prime editing. As a result, numerous desirable traits, such as disease resistance, high meat yield, improved feed efficiency, reduced fat deposition, and lower environmental waste, can be achieved easily and effectively by genetic modification. These traits can serve as valuable resources to enhance swine breeding programmes. In the era of genome editing, molecular breeding of pigs is critical to the future of agriculture. 
Long-term and multidomain analyses of genetically modified pigs by researchers, related policy development by regulatory agencies, and public awareness and acceptance of their safety are the keys to realizing the transition of genetically modified products from the laboratory to the market.","PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"19 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143582976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ruilin Su, Jingbo Lv, Yahui Xue, Sheng Jiang, Lei Zhou, Li Jiang, Junyan Tan, Zhencai Shen, Ping Zhong, Jianfeng Liu
{"title":"Genomic selection in pig breeding: comparative analysis of machine learning algorithms","authors":"Ruilin Su, Jingbo Lv, Yahui Xue, Sheng Jiang, Lei Zhou, Li Jiang, Junyan Tan, Zhencai Shen, Ping Zhong, Jianfeng Liu","doi":"10.1186/s12711-025-00957-3","DOIUrl":"https://doi.org/10.1186/s12711-025-00957-3","url":null,"abstract":"The effectiveness of genomic prediction (GP) significantly influences breeding progress, and employing SNP markers to predict phenotypic values is a pivotal aspect of pig breeding. Machine learning (ML) methods are often used to predict phenotypic values because of their advantages in processing high-dimensional data. However, existing research has not indicated which ML methods are suitable for most pig genomic prediction tasks. Therefore, appropriate methods must be selected from a large number of ML methods whenever genomic prediction is performed. This paper compared the performance of popular ML methods in predicting pig phenotypes and identified suitable methods for most traits. In this paper, five commonly used datasets from the literature were used to compare the performance of different ML methods. The experimental results demonstrate that Stacking performs best on the PIC dataset where the trait information is hidden, and the performance of kernel ridge regression with an RBF kernel (KRR-rbf) follows closely. Support vector regression (SVR) performs best in predicting reproductive traits, followed by genomic best linear unbiased prediction (GBLUP). GBLUP achieves the best performance on growth traits, with SVR as the second best. GBLUP achieves good performance for GP problems. Similarly, the Stacking, SVR, and KRR-rbf methods also achieve high prediction accuracy. Moreover, LR statistical analysis shows that Stacking, SVR and KRR are stable. 
When applying ML methods for phenotypic value prediction in pigs, we recommend these three approaches.","PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"38 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143582977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}