Anders K. Krabberød, Embla Stokke, Ella Thoen, Inger Skrede, Håvard Kauserud
{"title":"The Ribosomal Operon Database: A Full-Length rDNA Operon Database Derived From Genome Assemblies","authors":"Anders K. Krabberød, Embla Stokke, Ella Thoen, Inger Skrede, Håvard Kauserud","doi":"10.1111/1755-0998.14031","DOIUrl":"10.1111/1755-0998.14031","url":null,"abstract":"<p>Current rDNA reference sequence databases are tailored towards shorter DNA markers, such as parts of the 16/18S marker or the internally transcribed spacer (ITS) region. However, due to advances in long-read DNA sequencing technologies, longer stretches of the rDNA operon are increasingly used in environmental sequencing studies to increase the phylogenetic resolution. There is, therefore, a growing need for longer rDNA reference sequences. Here, we present the ribosomal operon database (ROD), which includes eukaryotic full-length rDNA operons fished from publicly available genome assemblies. Full-length operons were detected in 34.1% of the 34,701 examined eukaryotic genome assemblies from NCBI. In most cases (53.1%), more than one operon variant was detected, which can be due to intragenomic operon copy variability, allelic variation in non-haploid genomes, or technical errors from the sequencing and assembly process. The highest copy number found was 5947 in Zea mays. In total, 453,697 unique operons were detected, with 69,480 operon variant clusters remaining after intragenomic clustering at 99% sequence identity. The operon length varied extensively across eukaryotes, ranging from 4136 to 16,463 bp, which will lead to considerable polymerase chain reaction (PCR) bias during amplification of the entire operon. Clustering the full-length operons revealed that the different parts (i.e., 18S, 28S, and the hypervariable regions V4 and V9 of 18S) provide divergent taxonomic resolution, with 18S, the V4 and V9 regions being the most conserved. 
The ROD will be updated regularly to provide an increasing number of full-length rDNA operons to the scientific community.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14031","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142454256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revisiting the Briggs Ancient DNA Damage Model: A Fast Maximum Likelihood Method to Estimate Post-Mortem Damage","authors":"Lei Zhao, Rasmus Amund Henriksen, Abigail Ramsøe, Rasmus Nielsen, Thorfinn Sand Korneliussen","doi":"10.1111/1755-0998.14029","DOIUrl":"10.1111/1755-0998.14029","url":null,"abstract":"<p>One essential initial step in the analysis of ancient DNA is to authenticate that the DNA sequencing reads are actually from ancient DNA. This is done by assessing whether the reads exhibit typical characteristics of post-mortem damage (PMD), including cytosine deamination and nicks. We present a novel statistical method implemented in a fast multithreaded programme, ngsBriggs, that enables rapid quantification of PMD by estimation of the Briggs ancient damage model parameters (Briggs parameters). Using a multinomial model with maximum likelihood fit, ngsBriggs accurately estimates the parameters of the Briggs model, quantifying the PMD signal from single- and double-stranded DNA regions. We extend the original Briggs model to capture PMD signals for contemporary sequencing platforms and show that ngsBriggs accurately estimates the Briggs parameters across a variety of contamination levels. Classification of reads into ancient or modern reads, for the purpose of decontamination, is significantly more accurate using ngsBriggs than using other methods available. Furthermore, ngsBriggs is substantially faster than other state-of-the-art methods. 
ngsBriggs offers a practical and accurate method for researchers seeking to authenticate ancient DNA and improve the quality of their data.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14029","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142454254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lucas André Blattner, Pierre Lapellegerie, Colin Courtney-Mustaphi, Oliver Heiri
{"title":"Sediment Core DNA-Metabarcoding and Chitinous Remain Identification: Integrating Complementary Methods to Characterise Chironomidae Biodiversity in Lake Sediment Archives","authors":"Lucas André Blattner, Pierre Lapellegerie, Colin Courtney-Mustaphi, Oliver Heiri","doi":"10.1111/1755-0998.14035","DOIUrl":"10.1111/1755-0998.14035","url":null,"abstract":"<p>Chironomidae, so-called non-biting midges, are considered key bioindicators of aquatic ecosystem variability. Data derived from morphologically identifying their chitinous remains in sediments document chironomid larvae assemblages, which are studied to reconstruct ecosystem changes over time. Recent developments in sedimentary DNA (sedDNA) research have demonstrated that molecular techniques are suitable for determining past and present occurrences of organisms. Nevertheless, sedDNA records documenting alterations in chironomid assemblages remain largely unexplored. To close this gap, we examined the applicability of sedDNA metabarcoding to identify Chironomidae assemblages in lake sediments by sampling and processing three 21–35 cm long sediment cores from Lake Sempach in Switzerland. With a focus on developing analytical approaches, we compared an invertebrate-universal (FWH) and a newly designed Chironomidae-specific metabarcoding primer set (CH) to assess their performance in detecting Chironomidae DNA. We isolated and identified chitinous larval remains and compared the morphotype assemblages with the data derived from sedDNA metabarcoding. Results showed a good overall agreement of the morphotype assemblage-specific clustering among the chitinous remains and the metabarcoding datasets. Both methods indicated higher chironomid assemblage similarity between the two littoral cores in contrast to the deep lake core. Moreover, we observed a pronounced primer bias effect resulting in more Chironomidae detections with the CH primer combination compared to the FWH combination. 
Overall, we conclude that sedDNA metabarcoding can supplement traditional remain identifications and potentially provide independent reconstructions of past chironomid assemblage changes. Furthermore, it has the potential of more efficient workflows, better sample standardisation and species-level resolution datasets.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14035","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142454255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to “A dedicated target capture approach reveals variable genetic markers across micro-and macro-evolutionary time scales in palms”","authors":"","doi":"10.1111/1755-0998.14032","DOIUrl":"10.1111/1755-0998.14032","url":null,"abstract":"<p>de La Harpe, M., J. Hess, O. Loiseau, N. Salamin, C. Lexer, and M. Paris. 2019. A Dedicated Target Capture Approach Reveals Variable Genetic Markers Across Micro-and Macroevolutionary Time Scales in Palms. Molecular Ecology Resources, 19(1): 221–234. https://doi.org/10.1111/1755-0998.12945</p><p>For the sake of completeness, the authors wish to provide more detailed information in the acknowledgements section about the samples collection, with a more exhaustive list of researchers and students involved in the field sampling. In addition, we correct an inaccuracy in the funding information, replacing ‘Swiss National Science Foundation (SNSF), Grant/Award Number: CRSII3_147630; University of Zurich; Illumina; National Science Foundation’ by ‘Swiss National Science Foundation (SNSF), Grant/Award Number: CRSII3_147630; University of Fribourg’.</p><p>The updated Acknowledgement section is detailed below:</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14032","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142454252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexandrine Daniel, Paul Savary, Jean-Christophe Foltête, Gilles Vuidel, Bruno Faivre, Stéphane Garnier, Aurélie Khimoun
{"title":"What can optimized cost distances based on genetic distances offer? A simulation study on the use and misuse of ResistanceGA","authors":"Alexandrine Daniel, Paul Savary, Jean-Christophe Foltête, Gilles Vuidel, Bruno Faivre, Stéphane Garnier, Aurélie Khimoun","doi":"10.1111/1755-0998.14024","DOIUrl":"10.1111/1755-0998.14024","url":null,"abstract":"<p>Modelling population connectivity is central to biodiversity conservation and often relies on resistance surfaces reflecting multi-generational gene flow. ResistanceGA (RGA) is a common optimization framework for parameterizing these surfaces by maximizing the fit between genetic distances and cost distances using maximum likelihood population effect models. As the reliability of this framework has rarely been studied, we investigated the conditions maximizing its accuracy for both prediction and interpretation of landscape features' permeability. We ran demo-genetic simulations in contrasted landscapes for species with distinct dispersal capacities and specialization levels, using corresponding reference cost scenarios. We then optimized resistance surfaces from the simulated genetic distances using RGA. First, we evaluated whether RGA identified the drivers of the genetic patterns, that is, distinguished Isolation-by-Resistance (IBR) patterns from either Isolation-by-Distance or patterns unrelated to ecological distances. We then assessed RGA predictive performance using a cross-validation method, and its ability to recover the reference cost scenarios shaping genetic structure in simulations. IBR patterns were well detected and genetic distances were predicted with great accuracy. This performance depended on the strength of the genetic structuring, sampling design and landscape structure. Matching the scale of the genetic pattern by focusing on population pairs connected through gene flow and limiting overfitting through cross-validation further enhanced inference reliability. 
Yet, the optimized cost values often departed from the reference values, making their interpretation and extrapolation potentially dubious. While demonstrating the value of RGA for predictive modelling, we call for caution and provide additional guidance for its optimal use.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14024","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142454258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Antonio Baeza, Jeremiah J. Minish, Todd P. Michael
{"title":"Assembly of Mitochondrial Genomes Using Nanopore Long-Read Technology in Three Sea Chubs (Teleostei: Kyphosidae)","authors":"J. Antonio Baeza, Jeremiah J. Minish, Todd P. Michael","doi":"10.1111/1755-0998.14034","DOIUrl":"10.1111/1755-0998.14034","url":null,"abstract":"<p>Complete mitochondrial genomes have become markers of choice to explore phylogenetic relationships at multiple taxonomic levels and they are often assembled using whole-genome short-read sequencing. Herein, using three species of sea chubs as an example, we explored the accuracy of mitochondrial chromosomes assembled using Oxford Nanopore Technologies (ONT) Kit 14 R10.4.1 long reads at different sequencing depths (high, low and very low or genome skimming) by comparing them to ‘gold’ standard reference mitochondrial genomes assembled using Illumina NovaSeq short reads. In two species of sea chubs, <i>Girella nigricans</i> and <i>Kyphosus azureus</i>, ONT long-read assembled mitochondrial genomes at high sequencing depths (> 25× whole [nuclear] genome) were identical to their respective short-read assembled mitochondrial genomes. Not a single ‘homopolymer insertion’, ‘homopolymer deletion’, ‘simple substitution’, ‘single insertion’, ‘short insertion’, ‘single deletion’ or ‘short deletion’ was detected in the long-read assembled mitochondrial genomes after aligning each one of them to their short-read counterparts. In turn, in a third species, <i>Medialuna californiensis</i>, a 25× sequencing depth long-read assembled mitochondrial genome was 14 nucleotides longer than its short-read counterpart. The difference in total length between the latter two assemblies was due to the presence of a short motif 14 bp long that was repeated (twice) in the long-read but not in the short-read assembly. 
Read subsampling at a sequencing depth of 1× resulted in the assembly of partial or complete mitochondrial genomes with numerous errors, including, among others, simple indels, and indels at homopolymer regions. At 3× and 5× subsampling, genomes were identical (perfect) or almost identical (quasiperfect, 99.5% over 16,500 bp) to their respective Illumina assemblies. The newly assembled mitochondrial genomes exhibit identical gene composition and organisation compared with cofamilial species and a phylomitogenomic analysis based on translated protein-coding genes suggested that the family Kyphosidae is not monophyletic. The same analysis detected possible cases of misidentification of mitochondrial genomes deposited in GenBank. This study demonstrates that perfect (complete and fully accurate) or quasiperfect (complete but with a single or a very few errors) mitochondrial genomes can be assembled at high (> 25×) and low (3–5×) but not very low (1×, genome skimming) sequencing depths using ONT long reads and the latest ONT chemistries (Kit 14 and R10.4.1 flowcells with SUP basecalling). The newly assembled and annotated mitochondrial genomes can be used as a reference in environmental DNA studies focusing on bioprospecting and biomonitoring of these and other coastal species experiencing environmental insult. 
Given the small size of the sequencing device and low cost, we argue that ONT technology has the potential to improve access to high-throughput sequenc","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14034","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142454250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Paul D. N. Hebert, Robin Floyd, Saeideh Jafarpour, Sean W. J. Prosser
{"title":"Barcode 100K Specimens: In a Single Nanopore Run","authors":"Paul D. N. Hebert, Robin Floyd, Saeideh Jafarpour, Sean W. J. Prosser","doi":"10.1111/1755-0998.14028","DOIUrl":"10.1111/1755-0998.14028","url":null,"abstract":"<p>It is a global priority to better manage the biosphere, but action must be informed by comprehensive data on the abundance and distribution of species. The acquisition of such information is currently constrained by high costs. DNA barcoding can speed the registration of unknown animal species, the most diverse kingdom of eukaryotes, as the BIN system automates their recognition. However, inexpensive sequencing protocols are critical as the census of all animal species is likely to require the analysis of a billion or more specimens. Barcoding involves DNA extraction followed by PCR and sequencing with the last step dominating costs until 2017. By enabling the sequencing of highly multiplexed samples, the Sequel platforms from Pacific BioSciences slashed costs by 90%, but these instruments are only deployed in core facilities because of their expense. Sequencers from Oxford Nanopore Technologies provide an escape from high capital and service costs, but their low sequence fidelity has, until recently, constrained adoption. However, the improved performance of its latest flow cells (R10.4.1) erases this barrier. This study demonstrates that a MinION flow cell can characterise an amplicon pool derived from 100,000 specimens while a Flongle flow cell can process one derived from several thousand. 
At $0.01 per specimen, DNA sequencing is now the least expensive step in the barcode workflow.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14028","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142454251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jessen Havill, Olivia Strasburg, Tessy Udoh, Jacob E. Crawford, Andrea Gloria-Soria
{"title":"EVE-X: Software to Identify Novel Viral Insertions in Wild-Caught Arthropod Hosts From Next-Generation Short Read Data","authors":"Jessen Havill, Olivia Strasburg, Tessy Udoh, Jacob E. Crawford, Andrea Gloria-Soria","doi":"10.1111/1755-0998.14026","DOIUrl":"10.1111/1755-0998.14026","url":null,"abstract":"<div><p>Eukaryotic genomes harbour sequences derived from non-retroviral RNA viruses, known as endogenous viral elements (EVEs) or non-retroviral integrated RNA virus sequences (NIRVS). These sequences represent a record of past infections and have been implicated in host anti-viral response. We have created a program to identify viral sequences integrated in a host genome. It begins with a specimen BAM file and outputs candidate NIRVS, along with putative host insertion sites and overlapping genomic features of the host genome in XML and visual formats, with minimal intermediary intervention. We ran short-read data derived from the genomes of 222 wild-caught <i>A. aegypti</i> mosquitoes from a dozen geographical regions through this software and located putative NIRVS from seven virus families. This program is as accurate as currently available software for NIRVS detection, and represents a significant improvement in adaptability and user-friendliness. Furthermore, the flexibility of this pipeline allows the user to search for sequence integrations across the genome of any organism, as long as a query sequence database and a reference genome are provided. 
Potential extended applications include identification of integrated transgenic sequences used for research or vector control strategies.</p></div>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142386792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tiffany A. Kosch, Andrew J. Crawford, Rachel Lockridge Mueller, Katharina C. Wollenberg Valero, Megan L. Power, Ariel Rodríguez, Lauren A. O'Connell, Neil D. Young, Lee F. Skerratt
{"title":"Comparative analysis of amphibian genomes: An emerging resource for basic and applied research","authors":"Tiffany A. Kosch, Andrew J. Crawford, Rachel Lockridge Mueller, Katharina C. Wollenberg Valero, Megan L. Power, Ariel Rodríguez, Lauren A. O'Connell, Neil D. Young, Lee F. Skerratt","doi":"10.1111/1755-0998.14025","DOIUrl":"10.1111/1755-0998.14025","url":null,"abstract":"<p>Amphibians are the most threatened group of vertebrates and are in dire need of conservation intervention to ensure their continued survival. They exhibit unique features including a high diversity of reproductive strategies, permeable and specialized skin capable of producing toxins and antimicrobial compounds, multiple genetic mechanisms of sex determination and in some lineages, the ability to regenerate limbs and organs. Although genomic approaches would shed light on these unique traits and aid conservation, sequencing and assembly of amphibian genomes has lagged behind other taxa due to their comparatively large genome sizes. Fortunately, the development of long-read sequencing technologies and initiatives has led to a recent burst of new amphibian genome assemblies. Although growing, the field of amphibian genomics suffers from the lack of annotation resources, tools for working with challenging genomes and lack of high-quality assemblies in multiple clades of amphibians. Here, we analyse 51 publicly available amphibian genomes to evaluate their usefulness for functional genomics research. We report considerable variation in genome assembly quality and completeness and report some of the highest transposable element and repeat contents of any vertebrate. Additionally, we detected an association between transposable element content and climatic variables. Our analysis provides evidence of conserved genome synteny despite the long divergence times of this group, but we also highlight inconsistencies in chromosome naming and orientation across genome assemblies. 
We discuss sequencing gaps in the phylogeny and suggest key targets for future sequencing endeavours. Finally, we propose increased investment in amphibian genomics research to promote their conservation.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14025","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142370447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Julia C. Geue, Peng Liu, Sonesinh Keobouasone, Paul Wilson, Micheline Manseau
{"title":"MhGeneS: An Analytical Pipeline to Allow for Robust Microhaplotype Genotyping","authors":"Julia C. Geue, Peng Liu, Sonesinh Keobouasone, Paul Wilson, Micheline Manseau","doi":"10.1111/1755-0998.14027","DOIUrl":"10.1111/1755-0998.14027","url":null,"abstract":"<p>Microhaplotypes are small linked genomic regions comprising two or more single-nucleotide polymorphisms (SNPs) that are being applied in forensics and are emerging in wildlife monitoring studies and genomic epidemiology. Although microhaplotypes are typically targeted in non-coding regions, those in exonic regions can be designed with larger amplicons to capture functional non-synonymous sites and minimise insertion/deletion (indel) polymorphisms. Quality control is an important first step for high-confidence genotyping to counteract such false-positive variants. As genetic markers with higher polymorphism compared to biallelic SNPs, it is critical to ensure sequencing errors across the microhaplotype amplicon are filtered out to avoid introducing false haplotypes. We developed the MhGeneS pipeline, which works in tandem with Seq2Sat to help validate microhaplotype genotyping of the coding region of genes, with broader applicability to any microhaplotype profiling. We genotyped microhaplotype regions of the <i>Zfx</i> (≅ 160 bp) and <i>Zfy</i> (≅ 140 bp) genes, as well as an exon of the prion protein (<i>Prnp</i>) gene (≅ 370 bp) in caribou (<i>Rangifer tarandus</i>) using paired-end Illumina technology. As important quality metrics affecting microhaplotype calling, we identified the sequencing error rate profile related to the overlap or non-overlap of paired-end reads as well as the read depth as significant. In the case of <i>Prnp</i>, we achieved confident microhaplotype calling through MhGeneS by removing small sections of the 5′ and 3′ amplicons and using a minimum read depth of 20. 
Read depth and sequence trimming may be locus-specific, and validation of these parameters is recommended before the high-throughput profiling of samples.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14027","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142370448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}