{"title":"Private record linkage with linkage maps","authors":"Shreya Patel, Rinku Dewri","doi":"10.1002/spy2.265","DOIUrl":null,"url":null,"abstract":"Private record linkage is an actively pursued research area to facilitate the linkage of database records under the constraints of regulations that do not allow linkage agents to learn sensitive identities of record owners. Recent works have shown that linkage using commutative ciphers, which were discarded earlier for efficiency concerns, can be made feasible by leveraging precomputations, data parallelism, and probabilistic key reuse approaches. In this work, we propose further optimizations that can be performed to improve the runtime efficiency of such an approach. We transition from modular exponentiation ciphers to elliptic curve operations to improve precomputation time, eliminate memory intensive comparisons of encrypted values, and introduce data structures to detect negative comparisons. We benchmark the proposed approach using real world demographics data, and provide an extensive study of the parametric aspects of the approach. We also supplement our execution time results with an assessment of the residual privacy risk left by the approach. The approach can perform a linkage of two datasets with 10^5 records each in 20 minutes in a commodity laptop. This is achieved by eliminating the need to compare more than 70% of the record pairs. By design, the linkage accuracy is also retained at the same level as a nonprivate record linkage procedure.","PeriodicalId":29939,"journal":{"name":"Security and Privacy","volume":" ","pages":""},"PeriodicalIF":1.5000,"publicationDate":"2022-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Security and Privacy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1002/spy2.265","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citation count: 0
Abstract
Private record linkage is an active research area that aims to link database records under regulations that do not allow linkage agents to learn the sensitive identities of record owners. Recent work has shown that linkage using commutative ciphers, previously discarded over efficiency concerns, can be made feasible by leveraging precomputation, data parallelism, and probabilistic key reuse. In this work, we propose further optimizations that improve the runtime efficiency of such an approach. We transition from modular exponentiation ciphers to elliptic curve operations to improve precomputation time, eliminate memory-intensive comparisons of encrypted values, and introduce data structures to detect negative comparisons. We benchmark the proposed approach using real-world demographic data and provide an extensive study of its parametric aspects. We also supplement our execution time results with an assessment of the residual privacy risk left by the approach. The approach can link two datasets of 10^5 records each in 20 minutes on a commodity laptop. This is achieved by eliminating the need to compare more than 70% of the record pairs. By design, the linkage accuracy is retained at the same level as that of a nonprivate record linkage procedure.
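To make the notion of a commutative cipher concrete, the following is a minimal sketch of the generic modular exponentiation construction that the abstract names as the starting point. It is not the authors' protocol: the modulus `P`, the helper names (`keygen`, `encode`, `encrypt`, `same_record`), and the use of SHA-256 to encode a field are all assumptions made for illustration, and the sketch covers exact equality only. Each party encrypts with its own secret exponent, and because exponentiation commutes, two values match after double encryption exactly when the underlying plaintexts match.

```python
import hashlib
import math
import secrets

# Assumed toy modulus: the Mersenne prime 2**127 - 1 (far too small for real use).
P = 2**127 - 1


def keygen() -> int:
    """Pick a secret exponent coprime to P - 1, so encryption permutes [1, P-1]."""
    while True:
        k = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
        if math.gcd(k, P - 1) == 1:
            return k


def encode(record: str) -> int:
    """Hash a linkage field to a nonzero element modulo P (hypothetical encoding)."""
    digest = hashlib.sha256(record.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def encrypt(value: int, key: int) -> int:
    """Exponentiation cipher: value**key mod P."""
    return pow(value, key, P)


def same_record(a: str, b: str, key_a: int, key_b: int) -> bool:
    """Compare a (owned by party A) and b (owned by party B) under both keys.

    A encrypts a and B re-encrypts it; B encrypts b and A re-encrypts it.
    Commutativity makes the two results equal when the plaintexts are equal.
    """
    left = encrypt(encrypt(encode(a), key_a), key_b)
    right = encrypt(encrypt(encode(b), key_b), key_a)
    return left == right


if __name__ == "__main__":
    ka, kb = keygen(), keygen()
    print(same_record("jane doe 1980-01-01", "jane doe 1980-01-01", ka, kb))
    print(same_record("jane doe 1980-01-01", "john doe 1980-01-01", ka, kb))
```

In this sketch neither party sees the other's plaintext or key, only singly and doubly encrypted values. The modular exponentiations are the expensive step, which is consistent with the abstract's emphasis on precomputation and on avoiding comparisons altogether where possible.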