Efficient and model-agnostic parameter estimation under privacy-preserving post-randomization data

Impact factor: 1.0 | Region 4 (Mathematics) | JCR Q3 (Statistics & Probability)
Qinglong Tian, Jiwei Zhao
Journal: Canadian Journal of Statistics-Revue Canadienne De Statistique, 53(3)
Published: 2025-04-07
DOI: 10.1002/cjs.70003
URL: https://onlinelibrary.wiley.com/doi/10.1002/cjs.70003
Citations: 0

Abstract

Balancing data privacy with public access is critical for sensitive datasets. However, even after de-identification, the data are still vulnerable to, for example, inference attacks (by matching some keywords with external datasets). Statistical disclosure control (SDC) methods offer additional protection, and the post-randomization method (PRAM) adds noise to data to achieve this goal. However, PRAM-perturbed data pose challenges for analysis, as directly using the perturbed data leads to biased parameter estimates. This article addresses parameter estimation when data are perturbed using PRAM for privacy. While existing methods suffer from limitations like being parameter-specific, model-dependent and lacking optimality guarantees, our proposed method overcomes these limitations. Our approach applies to general parameters defined through estimating equations and makes no assumptions about the underlying data model. Furthermore, we prove that the proposed estimator achieves the semiparametric efficiency bound, making it asymptotically optimal in terms of estimation efficiency.
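The bias mentioned in the abstract can be seen in a small simulation. The sketch below is not the estimator proposed in the article; it only illustrates PRAM on a single categorical variable and a classical moment-based correction that inverts a known transition matrix. The matrix `P`, the true proportions `true_pi`, the sample size, and all function names are assumptions made up for this illustration.

```python
import numpy as np


def pram_perturb(x, P, rng):
    """Apply PRAM: replace each category x[i] by a draw from row P[x[i]].

    P[i, j] is the (assumed known) probability that true category i
    is released as category j.
    """
    k = P.shape[0]
    cum = np.cumsum(P, axis=1)           # row-wise cumulative probabilities
    u = rng.random(x.shape[0])           # one uniform draw per record
    drawn = (u[:, None] > cum[x]).sum(axis=1)
    return np.minimum(drawn, k - 1)      # guard against float round-off


def naive_proportions(x_star, k):
    """Category proportions computed directly from the perturbed data."""
    return np.bincount(x_star, minlength=k) / x_star.size


def moment_corrected(x_star, P):
    """Solve P^T pi = lambda, since E[naive proportions] = P^T pi."""
    lam = naive_proportions(x_star, P.shape[0])
    return np.linalg.solve(P.T, lam)


rng = np.random.default_rng(0)
true_pi = np.array([0.7, 0.2, 0.1])      # hypothetical true proportions
P = np.array([[0.8, 0.1, 0.1],           # hypothetical PRAM transition matrix
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

x = rng.choice(3, size=200_000, p=true_pi)   # original (private) data
x_star = pram_perturb(x, P, rng)             # released (perturbed) data

naive = naive_proportions(x_star, 3)
corrected = moment_corrected(x_star, P)

print("true      :", true_pi)
print("naive     :", np.round(naive, 3))
print("corrected :", np.round(corrected, 3))
```

In this setup the naive proportions converge to `P.T @ true_pi` rather than `true_pi`, which is the bias the abstract refers to. The matrix inversion removes that bias for this one parameter, but it is parameter-specific and carries no efficiency guarantee, which is the kind of limitation the article says its method addresses.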

Source journal
CiteScore: 1.40
Self-citation rate: 0.00%
Articles published: 62
Review time: >12 weeks
About the journal: The Canadian Journal of Statistics is the official journal of the Statistical Society of Canada. It has a reputation internationally as an excellent journal. The editorial board comprises statistical scientists with applied, computational, methodological, theoretical and probabilistic interests. Their role is to ensure that the journal continues to provide an international forum for the discipline of Statistics. The journal seeks papers making broad points of interest to many readers, whereas papers making important points of more specific interest are better placed in more specialized journals. The levels of innovation and impact are key in the evaluation of submitted manuscripts.