近似字符串匹配在亚线性的期望时间

Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science Pub Date : 1990-10-22 DOI:10.1109/FSCS.1990.89530

W. I. Chang, E. Lawler

{"title":"近似字符串匹配在亚线性的期望时间","authors":"W. I. Chang, E. Lawler","doi":"10.1109/FSCS.1990.89530","DOIUrl":null,"url":null,"abstract":"The k differences approximate string matching problem specifies a text string of length n, a pattern string of length m, and the number k of differences (insertions, deletions, substitutions) allowed in a match, and asks for every location in the text where a match occurs. Previous algorithms required at least O(nk) time. When k is as large as a fraction of m, no substantial progress has been made over O(nm) dynamic programming. The authors have investigated much faster algorithms for restricted cases of the problem, such as when the text string is random and errors are not too frequent. They have devised an algorithm that, for k<m/log n+O(1), runs in time O((n/m)k log n) on the average. In the worst case their algorithm is O(nk), but it is still an improvement in that it is very practical and uses only O(n) space compared with O(n) or O(n/sup 2/). The authors define an approximate substring matching problem and give efficient algorithms based on their techniques. Special cases include several applications to genetics and molecular biology.<<ETX>>","PeriodicalId":271949,"journal":{"name":"Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1990-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"111","resultStr":"{\"title\":\"Approximate string matching in sublinear expected time\",\"authors\":\"W. I. Chang, E. Lawler\",\"doi\":\"10.1109/FSCS.1990.89530\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The k differences approximate string matching problem specifies a text string of length n, a pattern string of length m, and the number k of differences (insertions, deletions, substitutions) allowed in a match, and asks for every location in the text where a match occurs. Previous algorithms required at least O(nk) time. When k is as large as a fraction of m, no substantial progress has been made over O(nm) dynamic programming. The authors have investigated much faster algorithms for restricted cases of the problem, such as when the text string is random and errors are not too frequent. They have devised an algorithm that, for k<m/log n+O(1), runs in time O((n/m)k log n) on the average. In the worst case their algorithm is O(nk), but it is still an improvement in that it is very practical and uses only O(n) space compared with O(n) or O(n/sup 2/). The authors define an approximate substring matching problem and give efficient algorithms based on their techniques. Special cases include several applications to genetics and molecular biology.<<ETX>>\",\"PeriodicalId\":271949,\"journal\":{\"name\":\"Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1990-10-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"111\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FSCS.1990.89530\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FSCS.1990.89530","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 111

摘要

k个差异近似字符串匹配问题指定一个长度为n的文本字符串，一个长度为m的模式字符串，以及匹配中允许的k个差异(插入、删除、替换)，并要求在文本中出现匹配的每个位置。以前的算法至少需要O(nk)时间。当k大到m的一个分数时，0 (nm)动态规划没有实质性进展。作者已经研究了用于该问题的有限情况的更快的算法，例如当文本字符串是随机的并且错误不太频繁时。他们设计了一个算法，对于k bb0

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Approximate string matching in sublinear expected time

The k differences approximate string matching problem specifies a text string of length n, a pattern string of length m, and the number k of differences (insertions, deletions, substitutions) allowed in a match, and asks for every location in the text where a match occurs. Previous algorithms required at least O(nk) time. When k is as large as a fraction of m, no substantial progress has been made over O(nm) dynamic programming. The authors have investigated much faster algorithms for restricted cases of the problem, such as when the text string is random and errors are not too frequent. They have devised an algorithm that, for k>

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science

自引率

0.00%

发文量