Construction of Optimal Edit Metric Codes
S. Houghten, D. Ashlock, J. Lenarz
2006 IEEE Information Theory Workshop - ITW '06 Chengdu
Published: 2006-10-01
DOI: 10.1109/ITW2.2006.323799 (https://doi.org/10.1109/ITW2.2006.323799)
Citations: 16
Abstract
The edit distance between two strings is the minimum number of substitutions, deletions, and insertions required to transform one string into the other. An error-correcting code over the edit metric combines features of deletion-correcting codes with those of the more traditional codes defined using Hamming distance. Applications of edit metric codes include the creation of robust tags over the DNA alphabet. This paper explores the theory underlying edit metric codes for small alphabets. The size of a sphere about a word depends heavily on its block structure, that is, its partition into maximal subwords of a single symbol. This creates a substantial divergence from the theory for the Hamming metric. An optimal code is one with the maximum possible number of codewords for its length and minimum distance. We provide tables of bounds on code sizes for edit codes with short length and small alphabets. We describe issues relating to exhaustive searches and present several heuristics for constructing codes.
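The definitions in the abstract can be made concrete with a small Python sketch. It is an illustration only and is not taken from the paper: the function names (`edit_distance`, `neighbors`, `sphere`, `greedy_code`) are hypothetical, the alphabet is assumed to be the four DNA symbols, spheres are assumed to contain words of any length (not only words of the codeword length), and the greedy construction is a generic lexicographic heuristic rather than one of the authors' heuristics.

```python
from itertools import product

DNA = "ACGT"  # assumed alphabet, suggested by the DNA-tag application


def edit_distance(a: str, b: str) -> int:
    """Edit (Levenshtein) distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free match
            ))
        prev = curr
    return prev[-1]


def neighbors(word: str, alphabet: str = DNA) -> set:
    """All strings exactly one substitution, deletion, or insertion away."""
    out = set()
    for i in range(len(word)):
        out.add(word[:i] + word[i + 1:])  # deletion at position i
        for s in alphabet:
            if s != word[i]:
                out.add(word[:i] + s + word[i + 1:])  # substitution
    for i in range(len(word) + 1):
        for s in alphabet:
            out.add(word[:i] + s + word[i:])  # insertion before position i
    return out


def sphere(word: str, radius: int, alphabet: str = DNA) -> set:
    """All strings (of any length) within the given edit distance of word."""
    ball = {word}
    frontier = {word}
    for _ in range(radius):
        frontier = {v for w in frontier for v in neighbors(w, alphabet)} - ball
        ball |= frontier
    return ball


def greedy_code(n: int, d: int, alphabet: str = DNA) -> list:
    """Scan length-n words in lexicographic order, keeping each word that is
    at edit distance >= d from every word kept so far (a generic heuristic)."""
    code = []
    for letters in product(alphabet, repeat=n):
        w = "".join(letters)
        if all(edit_distance(w, c) >= d for c in code):
            code.append(w)
    return code


if __name__ == "__main__":
    # One block ("AAAA") versus four blocks ("ACGT"), same length and radius.
    print(len(sphere("AAAA", 1)), len(sphere("ACGT", 1)))  # prints: 30 33
    print(greedy_code(4, 3))
```

Under these assumptions the two radius-1 spheres differ only in the number of distinct single deletions, which equals the number of blocks: one for `AAAA` and four for `ACGT`. In the Hamming metric every word of the same length has a sphere of the same size, so this is the divergence the abstract refers to. The greedy scan gives only a lower bound on the optimal code size; it does not establish optimality.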