Protecting privacy in tabular healthcare data: explicit uncertainty for disclosure control

B. Shand, J. Rashbass
Published in: Proceedings of the ACM Workshop on Privacy in the Electronic Society, pp. 20-26
Publication date: 2005-11-07
DOI: 10.1145/1102199.1102203 (https://doi.org/10.1145/1102199.1102203)
Citations: 4

Abstract

Summary medical data provides important statistical information for public health, but risks revealing confidential patient information. This risk is particularly difficult to assess when many different tables are released, each independently protected against disclosure by various techniques. In this paper, we present a new technique for disclosure control in tabular data which uses explicit uncertainty to prevent small numbers of records from being identified in a disclosive way. In contrast to other techniques, the bounds on the cell perturbations are also made public. The technique can be applied automatically and effectively to large datasets in their entirety, and the transformed data can then be used to create derivative tables or be hosted on a public web site. It is safe even for population-based data. Furthermore, we show that this transformation is computationally efficient while ensuring k-anonymity, and demonstrate the suitability of the transformed data for further statistical analysis.
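
The abstract does not spell out the transformation itself, so the following is only a minimal sketch of the general idea of publishing cell counts with explicit, publicly known bounds. It is not the authors' algorithm and makes no claim to the k-anonymity guarantee shown in the paper. The function names (to_interval, protect_table, combine), the fixed-width interval rule, and the example table are all assumptions made for illustration.

```python
from typing import Dict, Tuple

Interval = Tuple[int, int]  # (lowest possible count, highest possible count)


def to_interval(count: int, k: int) -> Interval:
    """Replace an exact count with a width-k interval that contains it.

    Hypothetical rule for illustration: counts are grouped into the
    ranges [0, k-1], [k, 2k-1], ... and only the range is published.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if k < 1:
        raise ValueError("k must be at least 1")
    low = (count // k) * k
    return (low, low + k - 1)


def protect_table(table: Dict[str, int], k: int) -> Dict[str, Interval]:
    """Apply the interval rule to every cell of a table of counts."""
    return {cell: to_interval(count, k) for cell, count in table.items()}


def combine(a: Interval, b: Interval) -> Interval:
    """Bounds for the sum of two cells, as needed for a derived table."""
    return (a[0] + b[0], a[1] + b[1])


# Made-up example counts, not data from the paper.
counts = {"age 0-39": 3, "age 40-64": 12, "age 65+": 27}
protected = protect_table(counts, k=5)

total: Interval = (0, 0)
for interval in protected.values():
    total = combine(total, interval)

print(protected)
print("bounds on the total:", total)
```

Because the bounds are public, an analyst can carry them through sums, as combine does here, and knows the worst-case error of any derived figure. Choosing the interval rule so that no combination of released tables isolates fewer than k records is the hard part, and that is what the paper addresses.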