Efficient computation of the Damerau-Levenshtein distance between biological sequences
Chunchun Zhao, S. Sahni
IEEE International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), October 2017
DOI: https://doi.org/10.1109/ICCABS.2017.8114295
Citations: 1
Abstract
We have developed linear-space algorithms to compute the Damerau-Levenshtein (DL) distance [1], [2] between two strings and to find an optimal trace, that is, a sequence of edit operations whose length equals the DL distance. Our algorithms require O(s min{m, n} + m + n) space, where s is the size of the alphabet and m and n are the lengths of the two strings. Previously known algorithms require O(mn) space. We have also developed cache-efficient and multi-core versions of the linear-space algorithms. The cache-miss efficiency of the algorithms was analyzed using a simple cache model.
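To make the space bounds concrete, the following Python sketch contrasts the well-known O(mn)-space dynamic program for the DL distance with a distance-only variant that keeps two working rows plus one saved row per alphabet symbol. It is an illustration written for this summary, not the authors' implementation: the function names `dl_distance_quadratic` and `dl_distance_rows` are hypothetical, the row-saving scheme is one assumed way to reach O(s min{m, n} + m + n) space, and it does not recover an optimal trace or address the cache-efficient and multi-core versions.

```python
def dl_distance_quadratic(a: str, b: str) -> int:
    """Textbook DL distance using a full (m+2) x (n+2) table: O(mn) space."""
    m, n = len(a), len(b)
    inf = m + n  # larger than any possible distance
    H = [[0] * (n + 2) for _ in range(m + 2)]
    H[0][0] = inf
    for i in range(m + 1):
        H[i + 1][0] = inf
        H[i + 1][1] = i
    for j in range(n + 1):
        H[0][j + 1] = inf
        H[1][j + 1] = j
    last_row = {}  # symbol -> last row of a (1-based) where it occurred
    for i in range(1, m + 1):
        last_col = 0  # last column in this row where a[i-1] matched b
        for j in range(1, n + 1):
            k = last_row.get(b[j - 1], 0)
            l = last_col
            cost = 1
            if a[i - 1] == b[j - 1]:
                cost = 0
                last_col = j
            H[i + 1][j + 1] = min(
                H[i][j] + cost,                               # match / substitute
                H[i + 1][j] + 1,                              # insert
                H[i][j + 1] + 1,                              # delete
                H[k][l] + (i - k - 1) + 1 + (j - l - 1),      # transposition
            )
        last_row[a[i - 1]] = i
    return H[m + 1][n + 1]


def dl_distance_rows(a: str, b: str) -> int:
    """Distance-only sketch: two working rows plus one saved row per symbol."""
    if len(b) > len(a):
        a, b = b, a  # the distance is symmetric; keep rows as short as possible
    m, n = len(a), len(b)
    inf = m + n
    inf_row = [inf] * (n + 2)            # stands in for the sentinel row H[0]
    prev = [inf] + list(range(n + 1))    # row H[1] of the full table
    saved = {}  # symbol -> (k, copy of row H[k]) for its last occurrence in a
    for i in range(1, m + 1):
        cur = [inf, i] + [0] * n
        last_col = 0
        for j in range(1, n + 1):
            k, row_k = saved.get(b[j - 1], (0, inf_row))
            l = last_col
            cost = 1
            if a[i - 1] == b[j - 1]:
                cost = 0
                last_col = j
            cur[j + 1] = min(
                prev[j] + cost,
                cur[j] + 1,
                prev[j + 1] + 1,
                row_k[l] + (i - k - 1) + 1 + (j - l - 1),
            )
        saved[a[i - 1]] = (i, prev)  # replaces any older row for this symbol
        prev = cur
    return prev[n + 1]


if __name__ == "__main__":
    for x, y in [("ACGT", "AGCT"), ("ca", "abc"), ("kitten", "sitting")]:
        print(x, y, dl_distance_quadratic(x, y), dl_distance_rows(x, y))
```

In the second function at most one row of length min{m, n} + 2 is retained per distinct symbol, which is where the s min{m, n} term would come from under this scheme. For a DNA alphabet (s = 4) that amounts to a handful of rows regardless of sequence length.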