Towards Auditable and Intelligent Privacy-Preserving Record Linkage

Anais Estendidos do XXXVI Simpósio Brasileiro de Banco de Dados (SBBD Estendido 2021). Pub Date: 2021-10-04. DOI: 10.5753/sbbd_estendido.2021.18170

T. Nóbrega, Carlos Eduardo S. Pires, D. Nascimento



Abstract

Privacy-Preserving Record Linkage (PPRL) integrates private or sensitive data from several data sources held by different parties. It aims to identify records (e.g., persons or objects) that represent the same real-world entity across private data sources held by different custodians. Due to recent laws and regulations (e.g., the General Data Protection Regulation), PPRL approaches are increasingly in demand in real-world application areas such as health care, credit analysis, public policy evaluation, and national security. As a result, the PPRL process must address both efficacy (linkage quality) and privacy. For instance, the PPRL process needs to be executed over data sources (e.g., a database containing personal information from governmental income distribution and assistance programs), linking the entities accurately while, at the same time, protecting the privacy of the information. Thus, this work intends to simplify the PPRL process so that real-world applications (such as medical, epidemiological, and population studies) require less legal and bureaucratic effort to access and process the data, making these applications more straightforward for companies and governments to execute. In this context, this work presents two major contributions to PPRL: i) improving linkage quality and simplifying the process by employing Machine Learning techniques to decide whether two records represent the same entity; and ii) enabling the auditability of the computations performed during PPRL.
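
To make contribution (i) concrete, the following is a minimal sketch of how a learned decision over encoded record pairs might look. It is not the authors' method: the abstract does not name an encoding or a model, so the keyed Bloom-filter encoding of character bigrams, the Dice similarity, and the one-feature threshold learner (a decision stump standing in for "Machine Learning techniques") are all assumptions chosen for illustration, as are every function name, the key, and the sample names.

```python
import hashlib
import hmac


def bigrams(value):
    """Split a padded, lower-cased string into its set of character bigrams."""
    s = f"_{value.lower().strip()}_"
    return {s[i:i + 2] for i in range(len(s) - 1)}


def encode(value, key, size=256, num_hashes=4):
    """Return the set bit positions of a keyed Bloom filter for `value`.

    Assumption: the custodians share `key`, so equal bigrams map to equal
    positions while the plaintext is never exchanged.
    """
    bits = set()
    for gram in bigrams(value):
        for i in range(num_hashes):
            digest = hmac.new(key, f"{i}:{gram}".encode(), hashlib.sha256).digest()
            bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


def dice(a, b):
    """Dice coefficient between two sets of bit positions."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def fit_threshold(scores, labels):
    """Learn the similarity cut-off that maximises accuracy on labelled pairs."""
    best_t, best_acc = 0.0, -1.0
    for t in sorted(set(scores)):
        acc = sum((s >= t) == y for s, y in zip(scores, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


# Hypothetical labelled pairs (True = same entity), used only to fit the cut-off.
key = b"hypothetical-shared-secret"
train = [
    ("maria silva", "maria silva", True),
    ("maria silva", "mariah silva", True),
    ("maria silva", "joao pereira", False),
    ("carlos souza", "ana lima", False),
]
scores = [dice(encode(a, key), encode(b, key)) for a, b, _ in train]
labels = [y for _, _, y in train]
threshold = fit_threshold(scores, labels)


def is_match(a, b):
    """Classify a record pair using only the encoded values."""
    return dice(encode(a, key), encode(b, key)) >= threshold


print(threshold, is_match("maria silva", "maria silva"))
```

A real deployment would likely use several attributes per record and a richer classifier; the point of the sketch is only that the match decision is learned from labelled pairs rather than fixed by hand, and that it operates on encodings instead of raw values.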


