PanDa Game: Optimized Privacy-Preserving Publishing of Individual-Level Pandemic Data Based on a Game Theoretic Model

IF 3.7 · CAS Zone 4 (Biology) · JCR Q1 (Biochemical Research Methods)

IEEE Transactions on NanoBioscience Pub Date : 2023-06-08 DOI:10.1109/TNB.2023.3284092

Abinitha Gourabathina;Zhiyu Wan;J. Thomas Brown;Chao Yan;Bradley A. Malin

Citations: 0

Abstract

PanDa Game: Optimized Privacy-Preserving Publishing of Individual-Level Pandemic Data Based on a Game Theoretic Model

Sharing individual-level pandemic data is essential for accelerating the understanding of a disease. For example, COVID-19 data have been widely collected to support public health surveillance and research. In the United States, these data are typically de-identified before publication to protect the privacy of the corresponding individuals. However, current publishing approaches for this type of data, such as those adopted by the U.S. Centers for Disease Control and Prevention (CDC), have not adapted over time to account for the dynamic nature of infection rates. Thus, the policies generated by these strategies can either raise privacy risks or overprotect the data and impair its utility (or usability). To optimize the tradeoff between privacy risk and data utility, we introduce a game theoretic model that adaptively generates policies for the publication of individual-level COVID-19 data according to infection dynamics. We model the data publishing process as a two-player Stackelberg game between a data publisher and a data recipient and then search for the best strategy for the publisher. In this game, we consider 1) the average performance of predicting future case counts and 2) the mutual information between the original data and the released data. We use COVID-19 case data from Vanderbilt University Medical Center from March 2020 to December 2021 to demonstrate the effectiveness of the new model. The results indicate that the game theoretic model outperforms all state-of-the-art baseline approaches, including those adopted by the CDC, while maintaining low privacy risk. We further perform extensive sensitivity analyses to show that our findings are robust to order-of-magnitude parameter fluctuations.
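To make the two ideas named in the abstract concrete, the following is a minimal Python sketch of a leader-follower (Stackelberg) search in which the publisher scores each candidate policy by the mutual information between the original and the released values. It is not the authors' model. The age-binning policy, the uniqueness-based risk measure, the payoff constants (`gain`, `cost`, `loss`), and all function names are assumptions made purely for illustration.

```python
import math
from collections import Counter


def mutual_information(original, released):
    """Empirical mutual information I(X; Y) in bits for two equal-length sequences."""
    n = len(original)
    px = Counter(original)
    py = Counter(released)
    pxy = Counter(zip(original, released))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def generalize_age(age, width):
    """Hypothetical policy: replace an age by the lower bound of its bin."""
    return (age // width) * width


def recipient_best_response(risk, gain=10.0, cost=2.0):
    """Follower: attack only if the expected gain exceeds the attack cost (toy payoffs)."""
    return risk * gain > cost


def publisher_payoff(utility, risk, attacked, loss=5.0):
    """Leader: utility of the release minus the expected loss if attacked (toy payoffs)."""
    return utility - (risk * loss if attacked else 0.0)


def solve_stackelberg(ages, widths):
    """Enumerate the leader's policies, anticipating the follower's best response."""
    best = None
    for w in widths:
        released = [generalize_age(a, w) for a in ages]
        utility = mutual_information(ages, released)
        # Toy risk proxy (an assumption): share of records unique in the release.
        counts = Counter(released)
        risk = sum(1 for r in released if counts[r] == 1) / len(released)
        attacked = recipient_best_response(risk)
        payoff = publisher_payoff(utility, risk, attacked)
        if best is None or payoff > best[1]:
            best = (w, payoff, attacked)
    return best


if __name__ == "__main__":
    ages = [21, 22, 23, 24, 35, 36, 61, 62]  # made-up values, not study data
    print(solve_stackelberg(ages, widths=[1, 10, 100]))
```

In this toy setting, releasing exact ages keeps the most information but invites an attack, while a single 100-year bin is safe but uninformative, so the search settles on the intermediate bin width. An adaptive scheme of the kind the abstract describes would presumably rerun such a search as case counts change, but the paper's actual payoff functions and policy space are not given in the abstract.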

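Source journal

IEEE Transactions on NanoBioscience (Engineering & Technology: Nanoscience and Nanotechnology)

CiteScore: 7.00

Self-citation rate: 5.10%

Articles published: 197

Review time: >12 weeks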

About the journal: The IEEE Transactions on NanoBioscience reports on original, innovative, and interdisciplinary work on all aspects of molecular systems, cellular systems, and tissues (including molecular electronics). Topics span a broad spectrum, covering both foundations and applications. Specifically, the journal covers methods and techniques, experimental aspects, design and implementation, instrumentation and laboratory equipment, clinical aspects, hardware and software data acquisition and analysis, and computer-based modelling (based on traditional or high-performance computing: parallel computers or computer networks).