Tatiana C de Souza, Luis F B Pinto, Valdecy A R da Cruz, Tatiane S Chud, Victor B Pedrosa, Gerson A O Junior, Hinayah R de Oliveira, Henrique A Mulim, Filippo Miglior, Flávio S Schenkel, Luiz F Brito
Journal of Dairy Science, published 2025-09-18. DOI: 10.3168/jds.2025-26715 (https://doi.org/10.3168/jds.2025-26715)
Genotype imputation accuracy of X chromosome variants in Holstein cattle based on different software and imputation strategies.
The X chromosome is one of the largest in the cattle genome, but little is known about the imputation of X chromosome variants. Thus, the main objective of this study was to assess the imputation accuracy of SNPs located on the X chromosome based on different strategies. Data from 2,505 Holstein cattle were used, and the imputation was carried out in 2 steps. Step 1 consisted of imputation from 5 medium-density (MD) SNP panels to a consolidated MD SNP panel, and step 2 was based on imputation from this consolidated MD SNP panel to a high-density (HD) SNP panel. Six scenarios (S1-S6) were evaluated for imputing autosomal SNPs (S1^A, S2^A, S3^A, S4^A, S5^A, S6^A), as well as the entire X chromosome (S1^X, S2^X, S3^X, S4^X, S5^X, S6^X) and the pseudoautosomal region (PAR; S1^PAR, S2^PAR, S3^PAR, S4^PAR, S5^PAR, S6^PAR) and non-PAR (S1^non-PAR, S2^non-PAR, S3^non-PAR, S4^non-PAR, S5^non-PAR, S6^non-PAR) segments of the X chromosome. The validation population in all these scenarios had 169 females and zero (S1, S2, and S3) or 583 (S4, S5, and S6) males, whereas the reference population had 169 (S2, S5) or 392 (S1, S3, S4, S6) females and zero (S1, S4), 196 (S2, S5), or 1,361 (S3, S6) males. Two imputation software tools (Minimac and FindHap) were compared across scenarios. Step 1 provided a consolidated MD SNP panel containing 2,132 and 63,259 SNPs located on the X and autosomal chromosomes, respectively, and step 2 resulted in an HD SNP panel with 5,921 and 294,865 SNPs located on the X and autosomal chromosomes, respectively. In step 1, the lowest average allelic correlation (R) was 0.93 (S4^PAR) with Minimac and 0.79 (S4^PAR) with FindHap, whereas the lowest genotypic concordance rate (CR) was 95.0 (S4^PAR) with Minimac and 85.0 (S4^PAR) when using FindHap. In step 2, the lowest R was 0.93 (S4^PAR and S4^non-PAR) with Minimac and 0.66 (S4^X) with FindHap, whereas the lowest CR was 96.2 (S4^PAR) with Minimac and 80.3 (S4^X) with FindHap.
In general, all the scenarios had high imputation accuracy of the X chromosome SNPs when using the Minimac software, whereas FindHap showed better accuracy with scenarios S3 and S6. Including both males and females in the reference and validation populations increased the imputation accuracy of X chromosome variants. These findings highlight the importance of the choice of the imputation software and the need for enlarging the reference populations to increase genotype imputation accuracy of the X chromosome variants in Holstein cattle.
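The two accuracy measures reported above, allelic correlation (R) and genotypic concordance rate (CR), can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas used in the study, so this sketch assumes R is the per-SNP Pearson correlation between true and imputed allele counts averaged over SNPs, and CR is the percentage of genotype calls that match exactly. The function names, the 0/1/2 allele-count coding, and the toy genotypes are all hypothetical; how hemizygous male genotypes in the non-PAR segment are coded is left to the caller.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; None if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined for a monomorphic SNP
    return cov / sqrt(var_x * var_y)


def allelic_correlation(true_geno, imputed_geno):
    """Average per-SNP correlation (assumed definition of R).

    Inputs are lists of rows (one row per animal, one column per SNP)
    holding allele counts. SNPs with undefined correlation are skipped.
    """
    per_snp = []
    for true_col, imputed_col in zip(zip(*true_geno), zip(*imputed_geno)):
        r = pearson(true_col, imputed_col)
        if r is not None:
            per_snp.append(r)
    return sum(per_snp) / len(per_snp) if per_snp else float("nan")


def concordance_rate(true_geno, imputed_geno):
    """Percentage of genotype calls that match exactly (assumed definition of CR)."""
    total = sum(len(row) for row in true_geno)
    matches = sum(
        a == b
        for true_row, imputed_row in zip(true_geno, imputed_geno)
        for a, b in zip(true_row, imputed_row)
    )
    return 100.0 * matches / total


# Toy data (not from the study): 4 animals x 3 SNPs, one wrong imputed call.
true_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
imputed_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 1, 2]]

print("R  =", round(allelic_correlation(true_genotypes, imputed_genotypes), 3))
print("CR =", round(concordance_rate(true_genotypes, imputed_genotypes), 1))
```

In a validation setup like the one described, `true_genotypes` would hold the observed HD genotypes that were masked in the validation animals and `imputed_genotypes` the calls returned by the imputation software, with the metrics computed separately for autosomal, X, PAR, and non-PAR SNP sets.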
About the journal:
The official journal of the American Dairy Science Association®, Journal of Dairy Science® (JDS) is the leading peer-reviewed general dairy research journal in the world. JDS readers represent education, industry, and government agencies in more than 70 countries with interests in biochemistry, breeding, economics, engineering, environment, food science, genetics, microbiology, nutrition, pathology, physiology, processing, public health, quality assurance, and sanitation.