Anonymization of Health Insurance Claims Data for Medication Safety Assessments.

Studies in health technology and informatics. Pub Date: 2025-09-03. DOI: 10.3233/SHTI251407

Mehmed Halilovic, Karen Otte, Thierry Meurers, Marco Alibone, Marion Ludwig, Nico Riedel, Steven Wolter, Lisa Kühnel, Steffen Hess, Fabian Prasser

Volume 331, pages 283-291. Publication type: Journal Article.


Abstract

Introduction: The re-use of health insurance claims data for research purposes can provide valuable insights to improve patient care. However, as health data is often highly sensitive and subject to strict regulatory frameworks, the privacy of individuals must be protected. Anonymization is a common approach to do so, but finding an effective strategy is challenging due to an inherent trade-off between privacy protection and data utility. A structured approach is needed to balance these objectives and guide the selection of appropriate anonymization strategies.

Methods: We present a systematic evaluation of twelve anonymization strategies applied to German health insurance claims data that had previously been used in a drug safety study. The dataset consisted of 1727 records and 45 variables. Based on structured threat modeling, we compare a conservative approach and a threat-modeling-based approach, each with six different privacy models and risk thresholds, using the ARX Data Anonymization Tool. We assess general data utility and empirically evaluate residual privacy risks using both the Anonymeter framework and a membership inference attack.
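To make the notion of a risk threshold concrete, here is a minimal Python sketch. It is not the ARX implementation, and the abstract does not say which privacy models or thresholds were used. The sketch assumes a simple worst-case re-identification risk, defined as 1 divided by the size of the smallest group of records sharing the same quasi-identifier values. The toy records, the column names (`age`, `sex`, `region`), the helper `generalize_age`, and the threshold of 0.5 are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    return Counter(keys)


def max_reidentification_risk(records, quasi_identifiers):
    """Worst-case risk: 1 / size of the smallest equivalence class (assumed model)."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return 1.0 / min(sizes.values())


def generalize_age(record, width=10):
    """Hypothetical generalization step: replace exact age with an age band."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Toy data, invented for illustration only (not from the study).
records = [
    {"age": 34, "sex": "F", "region": "North"},
    {"age": 37, "sex": "F", "region": "North"},
    {"age": 52, "sex": "M", "region": "South"},
    {"age": 58, "sex": "M", "region": "South"},
]
quasi_identifiers = ["age", "sex", "region"]

# Every raw record is unique on the quasi-identifiers.
raw_risk = max_reidentification_risk(records, quasi_identifiers)

# After banding ages into decades, records fall into two groups of two.
generalized = [generalize_age(r) for r in records]
generalized_risk = max_reidentification_risk(generalized, quasi_identifiers)

RISK_THRESHOLD = 0.5  # assumed threshold for this example
meets_threshold = generalized_risk <= RISK_THRESHOLD
```

In this toy case, generalizing age lowers the worst-case risk at the cost of less precise age values, which is the kind of privacy-utility trade-off the evaluation compares across strategies.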

Results: Our results show that conservative anonymization ensures strong privacy protection but reduces data utility. In contrast, threat modeling retains more utility while still providing acceptable privacy under moderate thresholds.

Conclusion: The proposed process enables a systematic comparison of privacy-utility trade-offs and can be adapted to other medical datasets. Our findings highlight the importance of context-specific anonymization strategies and empirical risk evaluation to guide anonymized data sharing in healthcare.
