Alexandre Gilardet, Edana Lord, Gonzalo Oteo García, Georgios Xenikoudakis, Katerina Douka, Matthew J Wooller, Timothy Rowe, Michael D Martin, Mathilde Le Moullec, Michail Anisimov, Peter D Heintzman, Love Dalén
{"title":"A High-Throughput Ancient DNA Extraction Method for Large-Scale Sample Screening.","authors":"Alexandre Gilardet, Edana Lord, Gonzalo Oteo García, Georgios Xenikoudakis, Katerina Douka, Matthew J Wooller, Timothy Rowe, Michael D Martin, Mathilde Le Moullec, Michail Anisimov, Peter D Heintzman, Love Dalén","doi":"10.1111/1755-0998.14077","DOIUrl":"https://doi.org/10.1111/1755-0998.14077","url":null,"abstract":"<p><p>Large-scale DNA screening of palaeontological and archaeological collections remains a limiting and costly factor for ancient DNA studies. Several DNA extraction protocols are routinely used in ancient DNA laboratories and have even been automated on robotic platforms. Robots offer a solution for high-throughput screening, but the costs, as well as the need for trained technicians and engineers, can be prohibitive for some laboratories. Here, we present a high-throughput alternative to robot-based ancient DNA extraction using a 96-column plate. When compared to routine single MinElute columns, we retrieved highly similar endogenous DNA contents, an important metric in ancient DNA screening. Mitogenomes with a coverage depth greater than 0.1× could be generated and allowed for taxonomic assignment. Average fragment lengths, DNA damage and library complexities differed significantly between methods, but these differences became nonsignificant after modification of our library purification protocol. Our high-throughput extraction method allows generation of 96 extracts within approximately 4 hours of laboratory work while bringing the cost down by ~39% compared to using single columns. 
Additionally, we formally demonstrate that the addition of Tween-20 during the elution step results in higher complexity libraries, thereby enabling higher genome coverage for the same sequencing effort.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":" ","pages":"e14077"},"PeriodicalIF":5.5,"publicationDate":"2025-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143253979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Matthew I M Pinder, Björn Andersson, Hannah Blossom, Marie Svensson, Karin Rengefors, Mats Töpel
{"title":"Bamboozle: A Bioinformatic Tool for Identification and Quantification of Intraspecific Barcodes.","authors":"Matthew I M Pinder, Björn Andersson, Hannah Blossom, Marie Svensson, Karin Rengefors, Mats Töpel","doi":"10.1111/1755-0998.14067","DOIUrl":"https://doi.org/10.1111/1755-0998.14067","url":null,"abstract":"<p><p>Evolutionary changes in populations of microbes, such as microalgae, cannot be traced using conventional metabarcoding loci as they lack intraspecific resolution. Consequently, selection and competition processes among strains of the same species cannot be resolved without elaborate isolation, culturing, and genotyping efforts. Bamboozle, a new bioinformatic tool introduced here, scans the entire genome of a species and identifies allele-rich barcodes that enable direct identification of different genetic strains from a population using amplicon sequencing of a single DNA sample. We demonstrate its usefulness by identifying hypervariable barcoding loci (< 500 bp) from genomic data in two microalgal species, the diploid diatom Skeletonema marinoi and the haploid chlorophyte Chlamydomonas reinhardtii. Across the two genomes, four and twenty-two loci, respectively, were identified that could in silico resolve all analysed genotypes. All of the identified loci are within protein-coding genes with various metabolic functions. Single nucleotide polymorphisms (SNPs) provided the most reliable genetic markers, and among 54 strains of S. marinoi, three 500 bp loci contained, on average, 46 SNPs, 103 strain-specific alleles, and displayed 100% heterozygosity. This high level of heterozygosity was identified as a novel opportunity to improve strain quantification and detect false positive artefacts during denoising of amplicon sequences. Finally, we illustrate how metabarcoding of a single genetic locus can be used to track abundances of S. marinoi strains in an artificial selection experiment. 
As future genomic datasets become available and DNA sequencing technologies develop, Bamboozle has flexible user settings enabling optimal barcodes to be designed for other species and applications.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":" ","pages":"e14067"},"PeriodicalIF":5.5,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143187519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Emily C Giles, Vanessa L González, Paulina Carimán, Carlos Leiva, Ana Victoria Suescún, Sarah Lemer, Marie Laure Guillemin, Daniel Ortiz-Barrientos, Pablo Saenz-Agudelo
{"title":"Comparative Genomics Points to Ecological Drivers of Genomic Divergence Among Intertidal Limpets.","authors":"Emily C Giles, Vanessa L González, Paulina Carimán, Carlos Leiva, Ana Victoria Suescún, Sarah Lemer, Marie Laure Guillemin, Daniel Ortiz-Barrientos, Pablo Saenz-Agudelo","doi":"10.1111/1755-0998.14075","DOIUrl":"https://doi.org/10.1111/1755-0998.14075","url":null,"abstract":"<p><p>Comparative genomic studies of closely related taxa are important for our understanding of the causes of divergence on a changing Earth. That said, the genomic resources available for marine intertidal molluscs are limited, and there are currently few publicly available high-quality annotated genomes for intertidal species and for molluscs in general. Here we report transcriptome assemblies for six species of Patellogastropoda and genome assemblies and annotations for three of these species (Scurria scurra, Scurria viridula and Scurria zebrina). Comparative analysis using these genomic resources suggests that and recently diverging lineages (10-20 Mya) have experienced similar amounts of contractions and expansions but across different gene families. Furthermore, differences among recently diverged species are reflected in variation in the amount of coding and noncoding material in genomes, such as the amount of repetitive elements and the lengths of transcripts, introns and exons. Additionally, functional ontologies of species-specific and duplicated genes together with demographic inference support the finding that recent divergence among members of the genus Scurria aligns with their unique ecological characteristics. 
Overall, the resources presented here will be valuable for future studies of adaptation in molluscs and in intertidal habitats as a whole.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":" ","pages":"e14075"},"PeriodicalIF":5.5,"publicationDate":"2025-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143062764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Katherine A Solari, Shakeel Ahmad, Ellie E Armstrong, Michael G Campana, Hussain Ali, Shoaib Hameed, Jami Ullah, Barkat Ullah Khan, Muhammad A Nawaz, Dmitri A Petrov
{"title":"Next-Generation Snow Leopard Population Assessment Tool: Multiplex-PCR SNP Panel for Individual Identification From Faeces.","authors":"Katherine A Solari, Shakeel Ahmad, Ellie E Armstrong, Michael G Campana, Hussain Ali, Shoaib Hameed, Jami Ullah, Barkat Ullah Khan, Muhammad A Nawaz, Dmitri A Petrov","doi":"10.1111/1755-0998.14074","DOIUrl":"https://doi.org/10.1111/1755-0998.14074","url":null,"abstract":"<p><p>In recent years, numerous single nucleotide polymorphism (SNP) panel methods to genotype non-invasive faecal samples have been developed. However, none of these existing methods fit all of the criteria necessary to make a SNP panel broadly usable for conservation projects in any country: cost-effectiveness, a streamlined lab protocol and user-friendly open-source bioinformatics protocols for panel design and analysis. Here, we present such a method and demonstrate its utility by developing a multiplex PCR SNP panel for conducting individual ID of snow leopards, Panthera uncia, from faecal samples. The SNP panel we present consists of 144 SNPs and utilises next-generation sequencing technology. We validate our SNP panel with paired tissue and faecal samples from zoo individuals, showing a minimum of 96.7% accuracy in allele calls per run. We then generate SNP data from 235 field-collected faecal samples from across Pakistan to show that the panel can reliably identify individuals from low-quality faecal samples of unknown age and is robust to contamination. We also show that our SNP panel has the capability to identify first-order relatives among sampled zoo individuals and provides insights into the geographic origin of samples. This SNP panel will empower the snow leopard research community in their efforts to assess local and global snow leopard population sizes. 
More broadly, we present a SNP panel development method that can be used for any species of interest for which adequate genomic reference data is available.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":" ","pages":"e14074"},"PeriodicalIF":5.5,"publicationDate":"2025-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143062706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nauras Daraghmeh, Katrina Exter, Justine Pagnier, Piotr Balazy, Ibon Cancio, Giorgos Chatzigeorgiou, Eva Chatzinikolaou, Maciej Chelchowski, Nathan Alexis Mitchell Chrismas, Thierry Comtet, Thanos Dailianis, Klaas Deneudt, Oihane Diaz de Cerio, Markos Digenis, Vasilis Gerovasileiou, José González, Laura Kauppi, Jon Bent Kristoffersen, Piotr Kukliński, Rafał Lasota, Liraz Levy, Magdalena Małachowicz, Borut Mavrič, Jonas Mortelmans, Estefania Paredes, Anita Poćwierz-Kotus, Henning Reiss, Ioulia Santi, Georgia Sarafidou, Grigorios Skouradakis, Jostein Solbakken, Peter A U Staehr, Javier Tajadura, Jakob Thyrring, Jesus S Troncoso, Emmanouela Vernadou, Frederique Viard, Haris Zafeiropoulos, Małgorzata Zbawicka, Christina Pavloudi, Matthias Obst
{"title":"A Long-Term Ecological Research Data Set From the Marine Genetic Monitoring Program ARMS-MBON 2018-2020.","authors":"Nauras Daraghmeh, Katrina Exter, Justine Pagnier, Piotr Balazy, Ibon Cancio, Giorgos Chatzigeorgiou, Eva Chatzinikolaou, Maciej Chelchowski, Nathan Alexis Mitchell Chrismas, Thierry Comtet, Thanos Dailianis, Klaas Deneudt, Oihane Diaz de Cerio, Markos Digenis, Vasilis Gerovasileiou, José González, Laura Kauppi, Jon Bent Kristoffersen, Piotr Kukliński, Rafał Lasota, Liraz Levy, Magdalena Małachowicz, Borut Mavrič, Jonas Mortelmans, Estefania Paredes, Anita Poćwierz-Kotus, Henning Reiss, Ioulia Santi, Georgia Sarafidou, Grigorios Skouradakis, Jostein Solbakken, Peter A U Staehr, Javier Tajadura, Jakob Thyrring, Jesus S Troncoso, Emmanouela Vernadou, Frederique Viard, Haris Zafeiropoulos, Małgorzata Zbawicka, Christina Pavloudi, Matthias Obst","doi":"10.1111/1755-0998.14073","DOIUrl":"https://doi.org/10.1111/1755-0998.14073","url":null,"abstract":"<p><p>Molecular methods such as DNA/eDNA metabarcoding have emerged as useful tools to document the biodiversity of complex communities over large spatio-temporal scales. We established an international Marine Biodiversity Observation Network (ARMS-MBON) combining standardised sampling using autonomous reef monitoring structures (ARMS) with metabarcoding for genetic monitoring of marine hard-bottom benthic communities. Here, we present the data of our first sampling campaign comprising 56 ARMS units deployed in 2018-2019 and retrieved in 2018-2020 across 15 observatories along the coasts of Europe and adjacent regions. We describe the open-access data set (image, genetic and metadata) and explore the genetic data to show its potential for marine biodiversity monitoring and ecological research. 
Our analysis shows that ARMS recovered more than 60 eukaryotic phyla capturing diversity of up to ~5500 amplicon sequence variants and ~1800 operational taxonomic units, and up to ~250 and ~50 species per observatory using the cytochrome c oxidase subunit I (COI) and 18S rRNA marker genes, respectively. Further, ARMS detected threatened, vulnerable and non-indigenous species often targeted in biological monitoring. We show that while deployment duration does not drive diversity estimates, sampling effort and sequencing depth across observatories do. We recommend that ARMS should be deployed for at least 3-6 months during the main growth season to use resources as efficiently as possible and that post-sequencing curation is applied to enable statistical comparison of spatio-temporal entities. We suggest that ARMS should be used in biological monitoring programs and long-term ecological research and encourage the adoption of our ARMS-MBON protocols.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":" ","pages":"e14073"},"PeriodicalIF":5.5,"publicationDate":"2025-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143062749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantifying Bone Collagen Fingerprint Variation Between Species.","authors":"Andrew Baker, Michael Buckley","doi":"10.1111/1755-0998.14072","DOIUrl":"https://doi.org/10.1111/1755-0998.14072","url":null,"abstract":"<p><p>Collagen is the most ubiquitous protein in the animal kingdom and one of the most abundant proteins on Earth. Despite having a relatively repetitive amino acid sequence motif that enables its triple helical structure, type 1 collagen, which dominates skin and bone, shows enough variation to support its increasing use for the biomolecular species identification of animal tissues processed or degraded beyond the amenability of DNA-based analyses. In recent years, this has been most commonly achieved through the technique of collagen peptide mass fingerprinting (PMF) known as ZooMS (Zooarchaeology by Mass Spectrometry), applied to the analysis of tens of thousands of samples across over one hundred studies in the past decade alone. However, a robust means to quantify variation between these fingerprints remains elusive, despite being increasingly required due to the shift towards a wider range of wild fauna and those that are more distantly related to currently known sequences. This is particularly problematic in fish due to their greater sequence variation. Here we evaluate the quantification of the relative closeness of collagen fingerprints between families using ANOSIM and a modified SIMPER analysis, incorporating relative peak intensity. Our results show a clear correlation between sequence differentiation and statistical distance of PMFs, indicating that the additional complexity of type 1 collagen in fish could directly affect the efficacy of biomolecular techniques such as ZooMS. 
Furthermore, this multivariate statistical analysis demonstrates that PMFs in fish are substantively more distinct than those of mammalian or amphibian taxa.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":" ","pages":"e14072"},"PeriodicalIF":5.5,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143057608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zachary P. Cohen, Lindsey C. Perkin, Tyler J. Raszick, Sheina B. Sim, Scott M. Geib, Anna K. Childers, Gregory A. Sword, Charles P.-C. Suh
{"title":"Pangenomics Links Boll Weevil Divergence With Ancient Mesoamerican Cotton Cultivation","authors":"Zachary P. Cohen, Lindsey C. Perkin, Tyler J. Raszick, Sheina B. Sim, Scott M. Geib, Anna K. Childers, Gregory A. Sword, Charles P.-C. Suh","doi":"10.1111/1755-0998.14054","DOIUrl":"10.1111/1755-0998.14054","url":null,"abstract":"<div><p>The boll weevil, <i>Anthonomus grandis grandis</i> Boheman, and thurberia weevil, <i>Anthonomus grandis thurberiae</i> Pierce, together comprise a species complex that ranges throughout Mexico, the southwestern regions of the United States and parts of South America. The boll weevil is a historically damaging and contemporaneously threatening pest to commercial upland cotton, <i>Gossypium hirsutum</i> L. (Malvales: Malvaceae), whereas the thurberia weevil is regarded as an innocuous non-pest subspecies that is mostly found on non-cultivated Thurber's or Arizona cotton, <i>Gossypium thurberi</i> L., throughout its native range in western Mexico and the southwestern United States. Recent independent analyses, using mitochondrial and whole-genome markers, have suggested the independent evolution of these lineages is more attributable to geographic isolation than biotic factors. We suggest a combination of drivers after employing comparative genomic, population genetic and pangenome methodologies to identify large and small polymorphisms. By leveraging genetic differences, we identify 39,310 diagnostic loci between the subspecies, find genes under selection, and model the subspecies' shared and unique evolutionary history. Interestingly, structural variations capture a large proportion of genes at the population level and demographic reconstruction suggests a split approximately 3,320–16,300 years before present (YBP), which coincides with cotton cultivation in Mesoamerica, approximately 3,000–5,000 YBP. 
Observed polymorphisms are enriched for reproductive, regulatory, and metabolic genes, which may be attributed to the subspecies split and coevolution with cultivated cotton. Our results demonstrate the utility of a holistic, comparative framework utilising small and large polymorphisms to reconstruct demography and identify genetic novelty via pangenomics.</p></div>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 3","pages":""},"PeriodicalIF":5.5,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142997006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joana Mariz, Ali Nawaz, Yvonne Bösch, Christian Wurzbacher
{"title":"Exploring Environmental Microfungal Diversity Through Serial Single Cell Screening","authors":"Joana Mariz, Ali Nawaz, Yvonne Bösch, Christian Wurzbacher","doi":"10.1111/1755-0998.14055","DOIUrl":"10.1111/1755-0998.14055","url":null,"abstract":"<p>Known for its remarkable diversity and ecological importance, the fungal kingdom remains largely unexplored. In fact, the number of unknown and undescribed fungi is predicted to exceed the number of known fungal species by far. Despite efforts to uncover these dark fungal taxa, we still face inherent sampling biases and methodological limitations. Here, we present a framework that combines taxonomic knowledge, molecular biology and data processing to explore the fungal biodiversity of enigmatic aquatic fungal lineages. Our work is based on serial screening of environmental fungal cells to approach unknown fungal taxa. Microscopic documentation is followed by DNA analysis of laser micro-dissected cells, coupled with a ribosomal operon barcoding step realised by long-read sequencing, followed by an optional whole genome sequencing step. We tested this approach on a range of aquatic fungal cells mostly belonging to the ecological group of aquatic hyphomycetes derived from environmental samples. From this initial screening, we were able to identify 60 potentially new fungal taxa in the target dataset. By extending this methodology to other fungal lineages associated with different habitats, we expect to increasingly characterise the molecular barcodes of dark fungal taxa in diverse environmental samples. 
This work offers a promising solution to the challenges posed by unknown and unculturable fungi and holds the potential to be applied to the diverse lineages of undescribed microeukaryotes.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 3","pages":""},"PeriodicalIF":5.5,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14055","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142996936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Genomic and Methylomic Signatures Associated With the Maintenance of Genome Stability and Adaptive Evolution in Two Closely Allied Wolf Spiders.","authors":"Qing Zuo, Run-Biao Wu, Li-Na Sun, Tian-Yu Ren, Zheng Fan, Lu-Yu Wang, Bing Tan, Bin Luo, Muhammad Irfan, Qian Huang, Yan-Jun Shen, Zhi-Sheng Zhang","doi":"10.1111/1755-0998.14071","DOIUrl":"https://doi.org/10.1111/1755-0998.14071","url":null,"abstract":"<p><p>Pardosa spiders, belonging to the wolf spider family Lycosidae, play a vital role in maintaining the health of forest and agricultural ecosystems due to their function in pest control. This study presents chromosome-level genome assemblies for two allied Pardosa species, P. laura and P. agraria. Both species' genomes show a notable expansion of helitron transposable elements, which contributes to their large genome sizes. Methylome analysis indicates that P. laura has higher overall DNA methylation levels compared to P. agraria. DNA methylation may not only aid in transposable element-driven genome expansion but also positively affect the three-dimensional organisation of P. laura after transposon amplification, thereby potentially enhancing genome stability. Genes associated with hyper-differentially methylated regions in P. laura (compared to P. agraria) are enriched in functions related to mRNA processing and energy production. Furthermore, combined transcriptome and methylome profiling has uncovered a complex regulatory interplay between DNA methylation and gene expression, emphasising the important role of gene body methylation in the regulation of gene expression. Comparative genomic analysis shows a significant expansion of cuticle protein and detoxification-related gene families in P. laura, which may improve its adaptability to various habitats. 
This study provides essential genomic and methylomic insights, offering a deeper understanding of the relationship between transposable elements and genome stability, and illuminating the adaptive evolution and species differentiation among allied spiders.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":" ","pages":"e14071"},"PeriodicalIF":5.5,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142996949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eric C Anderson, Rachael M Giglio, Matthew G DeSaix, Timothy J Smyser
{"title":"gscramble: Simulation of Admixed Individuals Without Reuse of Genetic Material.","authors":"Eric C Anderson, Rachael M Giglio, Matthew G DeSaix, Timothy J Smyser","doi":"10.1111/1755-0998.14069","DOIUrl":"https://doi.org/10.1111/1755-0998.14069","url":null,"abstract":"<p><p>While a best practice for evaluating the behaviour of genetic clustering algorithms on empirical data is to conduct parallel analyses on simulated data, these types of simulation techniques often involve sampling genetic data with replacement. In this paper we demonstrate that sampling with replacement, especially with large marker sets, inflates the perceived statistical power to correctly assign individuals (or the alleles that they carry) back to source populations, a phenomenon we refer to as resampling-induced, spurious power inflation (RISPI). To address this issue, we present gscramble, a simulation approach in R for creating biologically informed individual genotypes from empirical data that: (1) samples alleles from populations without replacement and (2) segregates alleles based on species-specific recombination rates. This framework makes it possible to simulate admixed individuals in a way that respects the physical linkage between markers on the same chromosome and that does not suffer from RISPI. This is achieved in gscramble by allowing users to specify pedigrees of varying complexity in order to simulate admixed genotypes, segregating and tracking haplotype blocks from different source populations through those pedigrees, and then sampling alleles from empirical data into those haplotype blocks using a variety of permutation schemes. 
We demonstrate the functionality of gscramble with both simulated and empirical data sets and highlight additional uses of the package that users may find valuable.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":" ","pages":"e14069"},"PeriodicalIF":5.5,"publicationDate":"2025-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142968815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}