Metagenomic CRISPR Array Analysis Tool: a novel graph-based approach to finding CRISPR arrays in metagenomic datasets
Fikrat Talibli, Björn Voß
microLife, volume 6, article uqaf016. Published 2025-07-17.
DOI: https://doi.org/10.1093/femsml/uqaf016
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12342471/pdf/
Abstract
Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated genes (CRISPR-Cas) form a bacterial immune system that is also famous for its use in genome editing. Metagenomic data could significantly increase the diversity of known systems. Here we present the Metagenomic CRISPR Array Analysis Tool (MCAAT), a highly sensitive algorithm for finding CRISPR arrays in unassembled metagenomic data. It exploits the property that CRISPR arrays form multicycles in de Bruijn graphs. We show that MCAAT reliably predicts CRISPR arrays in bacterial genome sequences and that its assembly-free, graph-based strategy outperforms assembly-based workflows and other assembly-free methods on synthetic and real metagenomes. Our new approach will help to increase the diversity of known CRISPR-Cas systems and enable studies of spacer evolution within metagenomic data sets.
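
The multicycle property mentioned in the abstract can be illustrated with a toy example. The sketch below is not MCAAT's implementation; it is a minimal illustration under stated assumptions: error-free reads, a made-up repeat and three made-up spacers, a small k-mer size (k = 5), and hypothetical helper names (`build_de_bruijn`, `simple_cycles_through`). Because every spacer leaves the repeat and returns to it, each spacer closes its own cycle in the de Bruijn graph, and all of these cycles share the nodes that lie inside the repeat.

```python
from collections import defaultdict

def build_de_bruijn(reads, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer successors seen in the reads."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def simple_cycles_through(graph, start, max_len):
    """Enumerate simple cycles that leave `start` and return to it (at most max_len nodes)."""
    cycles = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt == start:
                cycles.append(path)          # closed a cycle back to the start node
            elif nxt not in path and len(path) < max_len:
                stack.append((nxt, path + [nxt]))
    return cycles

# Toy CRISPR array (all sequences are invented for illustration).
repeat = "GACGCT"
spacers = ["AATTC", "CCAAG", "TGTAT"]
array = "CA" + "".join(repeat + s for s in spacers) + repeat + "GA"

# Simulate unassembled data as overlapping, error-free 12-nt reads.
reads = [array[i:i + 12] for i in range(len(array) - 11)]

graph = build_de_bruijn(reads, k=5)

# Assumption for this toy case: the node with the most successors is the
# last node of the repeat, where the paths into the spacers diverge.
start = max(graph, key=lambda node: len(graph[node]))

cycles = simple_cycles_through(graph, start, max_len=50)
shared = set.intersection(*(set(c) for c in cycles))

print("branch node:", start)
print("cycles found:", len(cycles))
print("cycle lengths:", sorted(len(c) for c in cycles))
print("nodes shared by all cycles:", sorted(shared))
```

In this toy graph the number of cycles through the branch node equals the number of spacers, each cycle has as many nodes as the repeat and one spacer combined, and the nodes common to all cycles are exactly the (k-1)-mers contained in the repeat. Real metagenomic reads contain sequencing errors, uneven coverage, and unrelated genomic repeats, so a practical tool needs filtering steps that this sketch does not attempt.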