César Martínez-Guardiola, Noam Brown, Fernando Silva-Coira, Dominik Köppl, T. Gagie, Susana Ladra
Augmented Thresholds for MONI. 2023 Data Compression Conference (DCC). DOI: 10.1109/DCC55655.2023.00035 (https://doi.org/10.1109/DCC55655.2023.00035). Publication date: 2022-11-14.
Abstract
MONI (Rossi et al., 2022) can store a pangenomic dataset T in small space and later, given a pattern P, quickly find the maximal exact matches (MEMs) of P with respect to T. In this paper we consider its one-pass version (Boucher et al., 2021), whose query times are dominated in our experiments by longest common extension (LCE) queries. We show how a small modification lets us avoid most of these queries, which significantly speeds up MONI in practice while only slightly increasing its size.
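
To make the two notions in the abstract concrete, the following is a minimal brute-force sketch of what an LCE query and a MEM are. It is only an illustration of the definitions, assuming plain uncompressed strings; it does not reflect MONI's compressed index, its thresholds, or the modification proposed in the paper. The function names `lce`, `matching_statistics`, and `mems` are hypothetical and chosen for this example.

```python
def lce(s, i, t, j):
    """Longest common extension: length of the longest common prefix
    of s[i:] and t[j:], computed by direct character comparison."""
    k = 0
    while i + k < len(s) and j + k < len(t) and s[i + k] == t[j + k]:
        k += 1
    return k


def matching_statistics(pattern, text):
    """ms[i] = length of the longest prefix of pattern[i:] that occurs
    somewhere in text (brute force over all text positions)."""
    return [
        max((lce(pattern, i, text, j) for j in range(len(text))), default=0)
        for i in range(len(pattern))
    ]


def mems(pattern, text):
    """Return the MEMs of pattern with respect to text as (start, length)
    pairs over pattern positions.

    pattern[i:i+ms[i]] is right-maximal by the definition of ms[i].
    It is also left-maximal exactly when i == 0 or ms[i-1] <= ms[i],
    because ms[i-1] > ms[i] would mean the match extends one character
    to the left and still occurs in text.
    """
    ms = matching_statistics(pattern, text)
    result = []
    for i, length in enumerate(ms):
        if length > 0 and (i == 0 or ms[i - 1] <= length):
            result.append((i, length))
    return result


if __name__ == "__main__":
    T = "GATTACA"
    P = "TTACGAT"
    for start, length in mems(P, T):
        print(start, length, P[start:start + length])
```

This naive version issues one LCE computation per pair of pattern and text positions, which is far too slow for pangenomic data; it is meant only to show why the number of LCE queries can dominate query time and why avoiding them matters.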