Mapping-based genome size estimation.
Shakunthala Natarajan, Jessica Gehrke, Boas Pucker
BMC Genomics 26(1): 482. Published 2025-05-14.
DOI: https://doi.org/10.1186/s12864-025-11640-8
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12079912/pdf/
Abstract
While the size of chromosomes can be measured under a microscope, obtaining the exact size of a genome remains a challenge. Biochemical methods and k-mer distribution-based approaches provide only estimates. An alternative approach that estimates the genome size from high-contiguity assemblies and read mappings is presented here. Analyses of Arabidopsis thaliana and Beta vulgaris data sets show the impact of different parameters. Oryza sativa, Brachypodium distachyon, Solanum lycopersicum, Vitis vinifera, and Zea mays were also analyzed to demonstrate the broad applicability of this approach. Further, MGSE was used to analyze Escherichia coli, Saccharomyces cerevisiae, and Caenorhabditis elegans data sets to show its utility beyond plants. Mapping-based Genome Size Estimation (MGSE) and additional scripts are available on GitHub (https://github.com/bpucker/MGSE). MGSE predicts genome sizes from short reads or long reads and requires a minimum coverage of 5-fold.
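The abstract does not spell out the calculation, so the following is only a minimal sketch of the general idea behind a mapping-based estimate, not the MGSE implementation: divide the total number of mapped bases by the coverage observed in regions assumed to be single-copy. The function name `estimate_genome_size`, the in-memory per-base coverage layout, and the choice of the median as the coverage summary are all assumptions made for illustration.

```python
from statistics import median


def estimate_genome_size(per_base_coverage, reference_regions):
    """Estimate genome size as total mapped bases / single-copy coverage.

    per_base_coverage: dict mapping contig name -> list of read depths,
        one value per assembly position (hypothetical input layout).
    reference_regions: list of (contig, start, end) tuples, 0-based and
        half-open, assumed here to be single-copy in the real genome.
    """
    # Sum of read depth over every assembly position = total mapped bases.
    total_mapped_bases = sum(sum(depths) for depths in per_base_coverage.values())

    # Collect the depths inside the assumed single-copy reference regions.
    reference_depths = []
    for contig, start, end in reference_regions:
        reference_depths.extend(per_base_coverage[contig][start:end])
    if not reference_depths:
        raise ValueError("no positions found in the reference regions")

    # Median depth is an assumed summary; a mean would also be plausible.
    single_copy_coverage = median(reference_depths)
    if single_copy_coverage <= 0:
        raise ValueError("reference regions have no coverage")

    return total_mapped_bases / single_copy_coverage


# Toy example: 10 single-copy positions at depth 10, followed by 5 positions
# at depth 20 (a repeat collapsed to one copy in the assembly).
toy_coverage = {"chr1": [10] * 10 + [20] * 5}
toy_estimate = estimate_genome_size(toy_coverage, [("chr1", 0, 10)])
print(toy_estimate)
```

In this toy case the assembly is 15 positions long, but the estimate is 20 because the doubled depth over the collapsed repeat is counted as extra sequence. That is the reason such an estimate can exceed the assembly length.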
About the journal:
BMC Genomics is an open access, peer-reviewed journal that considers articles on all aspects of genome-scale analysis, functional genomics, and proteomics.
BMC Genomics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.