Towards Split Learning-based Privacy-Preserving Record Linkage
Michail Zervas, Alexandros Karakasidis
arXiv - CS - Databases, 2024-09-02
DOI: arxiv-2409.01088 (https://doi.org/arxiv-2409.01088)
Abstract
Split Learning has recently been introduced to support applications in which
user data privacy is a requirement. However, it has not been thoroughly studied
in the context of Privacy-Preserving Record Linkage, the problem of identifying
records that refer to the same real-world entity across databases held by
different data holders without disclosing any additional information. In this
paper, we investigate the potential of Split Learning for Privacy-Preserving
Record Matching by introducing a novel training method based on Reference Sets,
which are publicly available data corpora. The method shows minimal impact on
matching quality compared with a traditional centralized SVM-based technique.