T. C. Theodosiou, Dimitrios Karapiperis, Vassilios S. Verykios
Using Wavelets for Matching Records Privately
Proceedings of the 21st Pan-Hellenic Conference on Informatics, 2017-09-28
DOI: 10.1145/3139367.3139371 (https://doi.org/10.1145/3139367.3139371)
Abstract
This paper presents a wavelet-based methodology for privacy-preserving record linkage. The methodology is introduced bottom-up, starting from simple text matching and extending to full record linkage. The discrete wavelet transform, combined with privacy-preserving operations, casts text into a fixed-length numerical sequence. Database records are then treated as collections of such sequences. Practical examples and implementation details accompany every development phase. The method is applied to simulated bibliographic records, and the results show performance comparable to other successful methodologies.
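To make the idea of casting text into a fixed-length numerical sequence concrete, the following minimal sketch applies a Haar discrete wavelet transform to a string. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the wavelet family, the character encoding, the padding rule, or the privacy-preserving operations, so the Haar wavelet, the code-point mapping, zero padding, and all function names (`text_to_signal`, `haar_step`, `encode`, `encode_record`, `record_distance`) are hypothetical choices, and the privacy-preserving step is omitted entirely.

```python
import math


def text_to_signal(text, length=16):
    """Map characters to code points, then truncate or zero-pad to `length`.

    Assumption: the paper's actual text-to-number mapping is not given here.
    """
    codes = [float(ord(ch)) for ch in text.lower()[:length]]
    return codes + [0.0] * (length - len(codes))


def haar_step(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    root2 = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    return approx, detail


def encode(text, length=16, levels=2):
    """Return the level-`levels` Haar approximation of the padded text.

    The output always has length // 2**levels coefficients, regardless of
    how long the input string is.
    """
    if length & (length - 1) or length >> levels == 0:
        raise ValueError("length must be a power of two no smaller than 2**levels")
    approx = text_to_signal(text, length)
    for _ in range(levels):
        approx, _detail = haar_step(approx)  # detail coefficients are discarded
    return approx


def distance(a, b):
    """Euclidean distance between two encoded sequences of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def encode_record(record, **kwargs):
    """Treat a record (field name -> text) as a collection of encoded fields."""
    return {field: encode(value, **kwargs) for field, value in record.items()}


def record_distance(rec_a, rec_b):
    """Mean per-field distance over shared fields (an assumed aggregation rule)."""
    shared = sorted(set(rec_a) & set(rec_b))
    if not shared:
        raise ValueError("records share no fields")
    return sum(distance(rec_a[f], rec_b[f]) for f in shared) / len(shared)


if __name__ == "__main__":
    a = encode_record({"title": "Using Wavelets", "year": "2017"})
    b = encode_record({"title": "Using Wavlets", "year": "2017"})
    print(record_distance(a, b))
```

Keeping only the coarse approximation coefficients is what fixes the output length; a real linkage pipeline would still need a matching threshold and the privacy-preserving operations that the abstract mentions but does not specify.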