Samuel J Widmayer, Lydia K Wooldridge, Emily Swanzey, Mary Barter, Chrystal Snow, Michael Saul, Qingchang Meng, Beth Dumont, Laura Reinholdt, Daniel M Gatti
Mammalian Genome. Journal Article. Published 2025-07-01. DOI: 10.1007/s00335-025-10148-6 (https://doi.org/10.1007/s00335-025-10148-6)
Low-coverage whole-genome sequencing facilitates accurate and cost-effective haplotype reconstruction in complex mouse crosses.
The search for the underlying genetic contributions to complex traits and diseases relies on accurate genetic data from populations of interest. Outbred populations, such as the Diversity Outbred (DO), are commonly genotyped using commercial SNP arrays, such as the Giga Mouse Universal Genotyping Array (GigaMUGA). However, array genotypes are expensive to collect, subject to significant ascertainment bias, and too sparse to capture the genetic structure of highly recombined mouse crosses. We investigated the efficacy of sequencing-based genotyping by comparing genotyping results among the GigaMUGA, double-digest restriction-site associated DNA sequencing (ddRADseq), and low-coverage whole-genome sequencing (lcWGS). We aligned reads at ~1× coverage, imputed segregating SNPs from the eight DO founder strains onto 48 DO genomes, and reconstructed their haplotypes using R/qtl2. Haplotype reconstructions derived from all three methods were highly concordant. However, lcWGS more faithfully recapitulated crossover counts and identified more small (<1 Mb) haplotype blocks, even at coverages as low as 0.1×. Over 90% of the local expression quantitative trait loci identified with the GigaMUGA in a set of 183 DO-derived embryoid bodies were recalled by lcWGS at coverages as low as 0.1×. We recommend that lcWGS be adopted as the primary method of genotyping complex crosses and the cell-based resources derived from them, because lcWGS-based reconstructions are as accurate as array-based reconstructions, are robust to ultra-low sequencing depths, may more accurately model haplotypes of the mouse genome that are difficult to resolve with dense reference data, and are cost-effective.
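To give a sense of scale for the coverages quoted in the abstract (~1× and 0.1×), the sketch below applies the standard definition of mean sequencing depth: total sequenced bases divided by genome size. The genome size (about 2.7 Gb for mouse) and the 150 bp read length are illustrative assumptions, not values taken from the paper, and the function names are hypothetical.

```python
import math

# Assumed values for illustration only; neither comes from the paper.
MOUSE_GENOME_BP = 2_700_000_000  # approximate mouse genome size
READ_LENGTH_BP = 150             # a common short-read length


def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Mean depth = total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


def reads_for_coverage(target: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads whose total bases reach the target mean depth."""
    return math.ceil(target * genome_size / read_length)


if __name__ == "__main__":
    for target in (1.0, 0.1):
        n = reads_for_coverage(target, READ_LENGTH_BP, MOUSE_GENOME_BP)
        depth = mean_coverage(n, READ_LENGTH_BP, MOUSE_GENOME_BP)
        print(f"target {target}x -> {n:,} reads (mean depth {depth:.3f}x)")
```

Under these assumptions, a tenfold drop in target coverage corresponds to a tenfold drop in the number of reads per sample, which is the main lever behind the cost argument. Actual read counts depend on the library type, read length, and the fraction of reads that align.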
About the journal:
Mammalian Genome focuses on the experimental, theoretical and technical aspects of genetics, genomics, epigenetics and systems biology in mouse, human and other mammalian species, with an emphasis on the relationship between genotype and phenotype, elucidation of biological and disease pathways as well as experimental aspects of interventions, therapeutics, and precision medicine. The journal aims to publish high quality original papers that present novel findings in all areas of mammalian genetic research as well as review articles on areas of topical interest. The journal will also feature commentaries and editorials to inform readers of breakthrough discoveries as well as issues of research standards, policies and ethics.