SatXplor-a comprehensive pipeline for satellite DNA analyses in complex genome assemblies.

IF 6.8 | Zone 2, Biology | JCR Q1, Biochemical Research Methods
Marin Volarić, Nevenka Meštrović, Evelin Despot-Slade
Briefings in Bioinformatics, 26(1). Journal article, published 2024-11-22. DOI: 10.1093/bib/bbae660 (https://doi.org/10.1093/bib/bbae660). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11663013/pdf/
Citations: 0

Abstract

Satellite DNAs (satDNAs) are tandemly repeated sequences that make up a significant portion of almost all eukaryotic genomes. Although satDNAs have been shown to play an important role in genome organization and evolution, they remain relatively poorly analyzed, even in model organisms. One of the main reasons for the current lack of in-depth studies on satDNAs is their underrepresentation in genome assemblies. Because satDNAs are complex, abundant, and highly repetitive, their analysis is challenging and requires efficient tools that ensure accurate annotation and comprehensive genome-wide analysis. We present a novel pipeline, named satellite DNA Exploration (SatXplor), designed to robustly characterize satDNA elements and analyze their arrays and flanking regions. SatXplor is benchmarked against other tools, and curated satDNA datasets from diverse species, including mice and humans, showcase its versatility across genomes with varying complexities and satDNA profiles. Its component algorithms excel in the identification of tandemly repeated sequences and, for the first time, enable evaluation of satDNA variation and array annotation together with information about the surrounding genomic landscape. SatXplor is an innovative pipeline for satDNA analysis that can be paired with any tool used for satDNA detection, offering insights into the structural characteristics, array determination, and genomic context of satDNA elements. By integrating various computational techniques, from sequence analysis and homology investigation to advanced clustering and graph-based methods, it provides a versatile and comprehensive approach for exploring the complexity of satDNA organization and understanding the underlying mechanisms and evolutionary aspects. It is open-source and freely accessible at https://github.com/mvolar/SatXplor.
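
To make the terms "array" and "flanking region" concrete, the following is a minimal illustrative sketch in Python. It is not SatXplor's code or algorithm: the function names, the `max_gap` and `flank` parameters, and the example coordinates are all hypothetical. It assumes that monomer hits on a single contig are already available as half-open intervals, merges nearby hits into arrays, and computes the flanks on either side of an array.

```python
from typing import List, Tuple

# Half-open [start, end) coordinates on a single contig (an assumption of this sketch).
Interval = Tuple[int, int]


def merge_hits_into_arrays(hits: List[Interval], max_gap: int) -> List[Interval]:
    """Merge monomer hits separated by at most max_gap bases into arrays.

    Hypothetical helper for illustration; not part of SatXplor.
    """
    arrays: List[Interval] = []
    for start, end in sorted(hits):
        if arrays and start - arrays[-1][1] <= max_gap:
            # Close enough to the previous array (or overlapping): extend it.
            prev_start, prev_end = arrays[-1]
            arrays[-1] = (prev_start, max(prev_end, end))
        else:
            # Too far from the previous array: start a new one.
            arrays.append((start, end))
    return arrays


def flanking_regions(
    array: Interval, contig_length: int, flank: int
) -> Tuple[Interval, Interval]:
    """Return the left and right flanks of an array, clipped to the contig."""
    start, end = array
    left = (max(0, start - flank), start)
    right = (end, min(contig_length, end + flank))
    return left, right


# Made-up monomer hits forming two clusters on a 6,000 bp contig.
hits = [(100, 270), (272, 440), (441, 610), (5000, 5170), (5171, 5340)]
arrays = merge_hits_into_arrays(hits, max_gap=10)
print(arrays)  # [(100, 610), (5000, 5340)]
print(flanking_regions(arrays[0], contig_length=6000, flank=500))  # ((0, 100), (610, 1110))
```

In this toy example the gap threshold decides whether neighboring monomers belong to the same array, and the left flank is clipped at the contig start. A real pipeline would need to choose such thresholds per satDNA family and handle both strands and multiple contigs.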

Source journal: Briefings in Bioinformatics (Biology: Biochemical Research Methods)
CiteScore: 13.20
Self-citation rate: 13.70%
Articles published: 549
Review time: 6 months
About the journal: Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.