An optimized filter for finding multiple repeats in DNA sequences

ACS/IEEE International Conference on Computer Systems and Applications - AICCSA 2010 Pub Date : 2010-05-16 DOI:10.1109/AICCSA.2010.5587026

Maria Federico, P. Peterlongo, N. Pisanti


Citations: 3

Abstract


An optimized filter for finding multiple repeats in DNA sequences

This paper presents new optimizations that improve a state-of-the-art algorithm for filtering sequences, used as a preprocessing step for finding multiple repeats under a given edit distance between each pair of occurrences. The target application is finding possibly long repeats with two or more occurrences, where each pair of occurrences may differ by substitutions, insertions, or deletions in up to 10 to 15% of their length. Comparable to multiple alignment, exact detection of multiple repeats is an NP-hard problem. To increase computation speed without resorting to heuristics, one can use filters that quickly discard large parts of the input that cannot contain the sought repeats. We describe, at a theoretical level, several optimizations that can be applied to the tool that is currently the state of the art for this filtering task. Finally, we report experiments in which the optimized tool outperforms its original version.
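To make the idea of a lossless filter concrete, here is a minimal sketch in Python. It is not the algorithm or tool described in the paper, whose details the abstract does not give. It assumes a simpler, classical necessary condition, the q-gram lemma: a window of length L that has another occurrence within edit distance d must share at least L - q + 1 - q*d of its q-grams with that occurrence. The function name `qgram_filter`, its parameters, and the assumption that occurrences do not overlap are all illustrative choices.

```python
from collections import Counter


def qgram_filter(seq: str, length: int, max_errors: int, q: int) -> list[bool]:
    """Mark positions covered by a window that passes a q-gram count test.

    Hypothetical sketch: a window of `length` characters can only have a
    second, non-overlapping occurrence within `max_errors` edits if at
    least length - q + 1 - q * max_errors of its q-grams also start
    somewhere else in `seq` (q-gram lemma). Windows below that bound are
    discarded; the rest are kept for a later exact search.
    """
    threshold = length - q + 1 - q * max_errors
    if threshold <= 0:
        raise ValueError("threshold is not positive; the filter would keep everything")

    n = len(seq)
    keep = [False] * n
    if n < length:
        return keep

    # Count how many times each q-gram starts in the sequence.
    counts = Counter(seq[p:p + q] for p in range(n - q + 1))

    # prefix[p] = number of start positions < p whose q-gram is repeated.
    prefix = [0]
    for p in range(n - q + 1):
        repeated = 1 if counts[seq[p:p + q]] >= 2 else 0
        prefix.append(prefix[-1] + repeated)

    qgrams_per_window = length - q + 1
    for i in range(n - length + 1):
        shared = prefix[i + qgrams_per_window] - prefix[i]
        if shared >= threshold:
            # The window may belong to a repeat: keep all of its positions.
            for j in range(i, i + length):
                keep[j] = True
    return keep


if __name__ == "__main__":
    # "ACGT" occurs twice; the two middle characters are covered by no surviving window.
    mask = qgram_filter("ACGTGGACGT", length=4, max_errors=0, q=3)
    print("".join("K" if flag else "." for flag in mask))
```

The test is deliberately weak: it only checks that enough q-grams of a window recur anywhere in the sequence, so it never discards a true non-overlapping repeat but may keep regions that contain none. A practical filter would impose stronger conditions on where the shared q-grams lie.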
