T. Ahmad, Kukuh Indrayana, W. Wibisono, R. Ijtihadie
2017 3rd International Conference on Science in Information Technology (ICSITech), October 2017. DOI: 10.1109/ICSITECH.2017.8257147
Edit distance weighting modification using phonetic and typographic letter grouping over homomorphic encrypted data
The Edit Distance string matching algorithm assigns the same weight to every mismatched character. In practice, a mismatch can be caused by a phonetic error, a typing error, or an unknown error. Editex improves on Edit Distance by modifying the algorithm, but it tolerates only phonetic errors. In this paper, we improve its performance by proposing a new weighting and distance calculation for the algorithm. The sources of mismatch are grouped into phonetic and typographic errors: characters are divided into phonetic groups and typographic groups, each with its own weight. Because it relies on this letter grouping, the proposed method is also suitable for implementation over homomorphically encrypted data. Experimental results show that the method produces a lower false positive rate than the Edit Distance and Editex algorithms: it generates 2.2 false positives per experiment, while Edit Distance and Editex produce 8.24 and 3.12, respectively. This suggests that the proposed method achieves a relatively low error rate.
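The idea of group-dependent substitution weights can be illustrated with a minimal plaintext sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the phonetic groups, the use of horizontally adjacent QWERTY keys as typographic groups, the weights 0.5 and 0.75, and the names `substitution_cost` and `grouped_edit_distance`. The homomorphic encryption layer is not shown.

```python
from itertools import permutations

# Assumed phonetic groups of similar-sounding letters (illustrative only).
PHONETIC_GROUPS = ["aeiouy", "bp", "ckq", "dt", "lr", "mn", "gj", "fpv", "sxz", "csz"]
# Assumed typographic grouping: keys that sit side by side on a QWERTY row.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

# Every ordered pair of distinct letters that share a phonetic group.
PHONETIC_PAIRS = {pair for group in PHONETIC_GROUPS for pair in permutations(group, 2)}

# Every ordered pair of horizontally adjacent keys, in both directions.
TYPO_PAIRS = set()
for row in QWERTY_ROWS:
    for left, right in zip(row, row[1:]):
        TYPO_PAIRS.add((left, right))
        TYPO_PAIRS.add((right, left))

# Placeholder weights; the paper's actual values are not given in the abstract.
W_PHONETIC = 0.5
W_TYPOGRAPHIC = 0.75
W_UNKNOWN = 1.0
W_INDEL = 1.0


def substitution_cost(a: str, b: str) -> float:
    """Cost of replacing a with b, depending on the assumed error source."""
    if a == b:
        return 0.0
    if (a, b) in PHONETIC_PAIRS:
        return W_PHONETIC
    if (a, b) in TYPO_PAIRS:
        return W_TYPOGRAPHIC
    return W_UNKNOWN


def grouped_edit_distance(s: str, t: str) -> float:
    """Standard edit distance recurrence with group-weighted substitutions."""
    s, t = s.lower(), t.lower()
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = [j * W_INDEL for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [i * W_INDEL]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + W_INDEL,                     # delete a
                curr[j - 1] + W_INDEL,                 # insert b
                prev[j - 1] + substitution_cost(a, b)  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    for pair in [("smith", "smyth"), ("cat", "cay"), ("cat", "cab")]:
        print(pair, grouped_edit_distance(*pair))
```

Under these assumed groups, a substitution within a phonetic group (i/y) costs less than one between neighbouring keys (t/y), which in turn costs less than an unrelated substitution (t/b). Since the cost depends only on whether two characters share a group, a group-based comparison of this kind is plausibly what makes the approach workable over encrypted data, although the abstract does not describe that construction.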