SANS ambages: phylogenomics with abundance-filter, multi-threading, and bootstrapping on amino-acid or genomic sequences.

IF 3.3 3区生物学 Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics Pub Date : 2025-09-02 DOI:10.1186/s12859-025-06204-2

Fabian Kolesch, Marco Sohn, Andreas Rempel, Pia Hippel, Roland Wittler

{"title":"SANS ambages: phylogenomics with abundance-filter, multi-threading, and bootstrapping on amino-acid or genomic sequences.","authors":"Fabian Kolesch, Marco Sohn, Andreas Rempel, Pia Hippel, Roland Wittler","doi":"10.1186/s12859-025-06204-2","DOIUrl":null,"url":null,"abstract":"Background: The increasing amount of available genome sequence data enables large-scale comparative studies. A common task is the inference of phylogenies- a challenging task if close reference sequences are not available, genome sequences are incompletely assembled, or the high number of genomes precludes multiple sequence alignment in reasonable time. SANS is an alignment-free, whole-genome based approach for phylogeny estimation.Results: Here we present a new implementation SANS ambages with a significantly increased application spectrum. It offers additional types of input data, parallelized processing, and bootstrapping. The source code (C++), documentation, and example data are freely available for download at: https://github.com/gi-bielefeld/sans . SANS can also be launched via the web-interface of the CloWM platform- free of charge, with a standard Life Science account: https://clowm.bi.denbi.de/workflows/0194b78f-9696-7402-a2b8-858508733618/ .Conclusions: The new version not only shortens processing time on large datasets immensely by parallelization. Being able to also process amino acid sequences and offering a filter for low-abundant DNA read segments also enables new application cases. Bootstrapping and integrated visualization ease and enrich the interpretation of the resulting phylogenies.","PeriodicalId":8958,"journal":{"name":"BMC Bioinformatics","volume":"26 1","pages":"227"},"PeriodicalIF":3.3000,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12403963/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s12859-025-06204-2","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}

引用次数: 0

Abstract

Background: The increasing amount of available genome sequence data enables large-scale comparative studies. A common task is the inference of phylogenies- a challenging task if close reference sequences are not available, genome sequences are incompletely assembled, or the high number of genomes precludes multiple sequence alignment in reasonable time. SANS is an alignment-free, whole-genome based approach for phylogeny estimation.

Results: Here we present a new implementation SANS ambages with a significantly increased application spectrum. It offers additional types of input data, parallelized processing, and bootstrapping. The source code (C++), documentation, and example data are freely available for download at: https://github.com/gi-bielefeld/sans . SANS can also be launched via the web-interface of the CloWM platform- free of charge, with a standard Life Science account: https://clowm.bi.denbi.de/workflows/0194b78f-9696-7402-a2b8-858508733618/ .

Conclusions: The new version not only shortens processing time on large datasets immensely by parallelization. Being able to also process amino acid sequences and offering a filter for low-abundant DNA read segments also enables new application cases. Bootstrapping and integrated visualization ease and enrich the interpretation of the resulting phylogenies.

Abstract Image

查看原文本刊更多论文

SANS ambages：系统基因组学与丰度过滤器，多线程，和自举氨基酸或基因组序列。

背景：可用的基因组序列数据量的增加使大规模的比较研究成为可能。一个常见的任务是系统发生的推断——如果没有接近的参考序列，基因组序列不完全组装，或者基因组数量多，无法在合理的时间内进行多序列比对，这是一项具有挑战性的任务。SANS是一种无比对，基于全基因组的系统发育估计方法。结果：在这里，我们提出了一个新的实现，具有显著增加的应用范围。它提供了额外类型的输入数据、并行处理和自引导。源代码（c++）、文档和示例数据可以从https://github.com/gi-bielefeld/sans免费下载。SANS也可以通过CloWM平台的网络界面启动-免费，使用标准的生命科学帐户：https://clowm.bi.denbi.de/workflows/0194b78f-9696-7402-a2b8-858508733618/。结论：新版本不仅通过并行化极大地缩短了大型数据集的处理时间。能够处理氨基酸序列并为低丰度DNA读段提供过滤器也使新的应用案例成为可能。引导和集成可视化简化并丰富了对系统发生的解释。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

BMC Bioinformatics 生物-生化研究方法

CiteScore

5.70

自引率

3.30%

发文量

506

审稿时长

4.3 months

期刊介绍： BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.