Rick Beeloo, Aldert L Zomer, Sebastian Deorowicz, Bas E Dutilh
NAR Genomics and Bioinformatics, volume 6, issue 4, lqae142. Published 2024-10-23.
DOI: https://doi.org/10.1093/nargab/lqae142
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11497850/pdf/
Graphite: painting genomes using a colored de Bruijn graph.
The recent growth of microbial sequence data allows comparisons at unprecedented scales, enabling the tracking of strains, mobile genetic elements, or genes. Querying a genome against a large reference database can easily yield thousands of matches that are tedious to interpret and pose computational challenges. We developed Graphite, which uses a colored de Bruijn graph (cDBG) to paint query genomes, selecting the local best matches along the full query length. By focusing on the best genomic match of each query region, Graphite reduces the number of matches while providing the most promising leads for sequence tracking or genomic forensics. When we applied Graphite to hundreds of Campylobacter genomes, we found extensive gene sharing, including a previously undetected C. coli plasmid that matched a C. jejuni chromosome. Together, genome painting using cDBGs, as enabled by Graphite, can reveal new biological phenomena by mitigating computational hurdles.
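The idea of painting a query with local best matches can be illustrated with a small sketch. This is not Graphite's implementation: it swaps the cDBG for a plain dictionary from k-mers to the set of references (colors) containing them, and it uses a simple greedy rule that keeps, at each query position, the color shared by the longest run of consecutive k-mers. The names `build_colored_index` and `paint`, the toy sequences, and `k = 4` are all assumptions made for illustration. A run of shared k-mers here does not guarantee the k-mers are contiguous in the reference, which a real tool would need to check.

```python
from collections import defaultdict


def build_colored_index(references, k):
    """Map each k-mer to the set of reference names (colors) that contain it."""
    index = defaultdict(set)
    for name, seq in references.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def paint(query, index, k):
    """Greedy painting (illustrative assumption, not Graphite's algorithm).

    Returns (start, end, color) tuples with half-open base coordinates.
    Adjacent segments may overlap by up to k - 1 bases.
    """
    segments = []
    n_kmers = len(query) - k + 1
    i = 0
    while i < n_kmers:
        colors = index.get(query[i:i + k], set())
        if not colors:
            i += 1  # k-mer absent from every reference: leave unpainted
            continue
        best_color, best_end = None, i
        for color in sorted(colors):  # sorted for deterministic tie-breaking
            j = i
            # Extend while consecutive query k-mers still carry this color.
            while j < n_kmers and color in index.get(query[j:j + k], ()):
                j += 1
            if j > best_end:
                best_color, best_end = color, j
        # k-mers i .. best_end-1 cover bases i .. best_end + k - 2 inclusive.
        segments.append((i, best_end + k - 1, best_color))
        i = best_end
    return segments


# Toy data (hypothetical sequences, not from the paper).
references = {"r1": "AAACCCGGG", "r2": "GGGTTTACA"}
index = build_colored_index(references, k=4)
print(paint("AACCCGGGTTTA", index, k=4))
# prints [(0, 8, 'r1'), (5, 12, 'r2')]
```

Each query region is reported once, attributed to its longest-matching reference, rather than once per matching reference. That is the sense in which painting reduces the number of matches to interpret.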