Finding Rearrangements in Nanopore DNA Reads with LAST and dnarrange.

Q4 Biochemistry, Genetics and Molecular Biology

Methods in molecular biology. Pub Date: 2023-01-01. DOI: 10.1007/978-1-0716-2996-3_12

Martin C Frith, Satomi Mitsuhashi

Methods in molecular biology, volume 2632, pages 161-175.

Abstract

Long-read DNA sequencing techniques such as nanopore are especially useful for characterizing complex sequence rearrangements, which occur in some genetic diseases and also during evolution. Analyzing the sequence data to understand such rearrangements is not trivial, due to sequencing error, rearrangement intricacy, and abundance of repeated similar sequences in genomes. The LAST and dnarrange software packages can resolve complex relationships between DNA sequences and characterize changes such as gene conversion, processed pseudogene insertion, and chromosome shattering. They can filter out numerous rearrangements shared by controls, e.g., healthy humans versus a patient, to focus on rearrangements unique to the patient. One useful ingredient is last-train, which learns the rates (probabilities) of deletions, insertions, and each kind of base match and mismatch. These probabilities are then used to find the most likely sequence relationships/alignments, which is especially useful for DNA with unusual rates, such as DNA from Plasmodium falciparum (malaria) with ~80% A+T. This is also useful for less-studied species that lack reference genomes, where the DNA reads are compared to a different species' genome. We also point out that a reference genome with ancestral alleles would be ideal.
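
To illustrate why learned match and mismatch probabilities matter for DNA with unusual composition, the following is a minimal sketch and not last-train's actual procedure. It builds a hypothetical joint distribution over aligned base pairs for a genome with 80% A+T, then converts it to log-odds scores. The function names, the 90% identity value, and the way mismatch probability is spread over base pairs are all assumptions made for illustration.

```python
import math
from itertools import product

BASES = "ACGT"


def toy_pair_probabilities(at_fraction=0.8, identity=0.9):
    """Hypothetical joint probabilities of aligned base pairs (not learned from data)."""
    # Background base frequencies: A and T share at_fraction, C and G share the rest.
    q = {b: at_fraction / 2 if b in "AT" else (1 - at_fraction) / 2 for b in BASES}
    # Total weight of all mismatched pairs, used to normalize the mismatch mass.
    mismatch_weight = sum(q[x] * q[y] for x, y in product(BASES, repeat=2) if x != y)
    probs = {}
    for x, y in product(BASES, repeat=2):
        if x == y:
            probs[x, y] = identity * q[x]
        else:
            probs[x, y] = (1 - identity) * q[x] * q[y] / mismatch_weight
    return probs


def log_odds_scores(pair_probs):
    """Score each base pair as log2(P(x, y) / (P(x) * P(y)))."""
    row = {x: sum(pair_probs[x, y] for y in BASES) for x in BASES}
    col = {y: sum(pair_probs[x, y] for x in BASES) for y in BASES}
    return {
        (x, y): math.log2(pair_probs[x, y] / (row[x] * col[y]))
        for x, y in product(BASES, repeat=2)
    }


# Print the 4x4 score matrix for the toy AT-rich model.
scores = log_odds_scores(toy_pair_probabilities())
for x in BASES:
    print(x, " ".join(f"{scores[x, y]:6.2f}" for y in BASES))
```

Under these assumptions, an A:A or T:T match scores lower than a C:C or G:G match, because matching a common base is weaker evidence of a true relationship than matching a rare one. A fixed, composition-blind scoring scheme would miss this distinction.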

Source journal

Methods in molecular biology (Biochemistry, Genetics and Molecular Biology: Genetics)

CiteScore: 2.00

Self-citation rate: 0.00%

Articles published: 3536

Journal description: For over 20 years, biological scientists have come to rely on the research protocols and methodologies in the critically acclaimed Methods in Molecular Biology series. The series was the first to introduce the step-by-step protocols approach that has become the standard in all biomedical protocol publishing. Each protocol is provided in readily reproducible step-by-step fashion, opening with an introductory overview and a list of the materials and reagents needed to complete the experiment, followed by a detailed procedure that is supported by a helpful notes section offering tips and tricks of the trade as well as troubleshooting advice.