A differential privacy protection query language for medical data: a proof-of-concept system validation

Huanhuan Wang, Yongting Zhang, Hong-Li Yin, Ruirui Li, Xiang Wu
Journal of Bio-X Research. Published 2021-06-08. DOI: 10.1097/JBR.0000000000000099 (https://doi.org/10.1097/JBR.0000000000000099)

Abstract

Objective: Medical data mining and sharing are important processes in E-Health applications. However, because these data contain a large amount of patients' private personal information, sharing and mining them carries a risk of privacy disclosure. Ensuring the security of medical big data during publishing, sharing, and mining has therefore become a focus of current research. The objective of our study is to design a framework based on a differential privacy protection mechanism to ensure the secure sharing of medical data. We developed a privacy protection query language (PQL) that integrates multiple data mining methods and provides a secure sharing function.

Methods: This study was conducted mainly at Xuzhou Medical University, China. The framework comprises three sub-modules: a parsing module, a mining module, and a noising module. Each module encapsulates its own computing methods, such as a composite parser and a noise theory. In the PQL framework, we apply differential privacy theory to the results computed between modules to guarantee the security of the various mining algorithms. These computing components operate independently, but the mining results depend on their cooperation. In addition, PQL is encapsulated in MNSSp3, a data mining and secure sharing platform, and the data come from public data sets, such as UCBI. A public data set (the NCBI database) was used as the experimental data, and the data were collected in January 2020.

Results: We designed and developed a query language that provides functions for medical data mining, sharing, and privacy preservation. We proved the performance of the PQL framework theoretically. The experimental results show that the PQL framework can ensure the security of each mining result and that the availability of the output results is above 97%.

Conclusion: Our framework enables medical data providers to securely share health data or treatment data, and it provides a usable query language, based on a differential privacy mechanism, that enables researchers to mine information securely using data mining algorithms.
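
As an illustration only, the parse, mine, and noise flow described in the Methods can be sketched in a few lines of Python. The abstract does not give the PQL grammar, the mining algorithms, or the noise mechanism that the authors used, so everything below is an assumption: the toy query syntax `COUNT <column> EPSILON <value>`, the function names, and the choice of the standard Laplace mechanism for a counting query (sensitivity 1) are hypothetical stand-ins, not the authors' implementation.

```python
import random
import re

# Hypothetical toy syntax, e.g. "COUNT diabetic EPSILON 0.5"; NOT the real PQL grammar.
QUERY_RE = re.compile(r"\s*COUNT\s+(\w+)\s+EPSILON\s+(\d+(?:\.\d+)?)\s*", re.IGNORECASE)


def parse(query):
    """Parsing step: extract the column to count and the privacy budget epsilon."""
    match = QUERY_RE.fullmatch(query)
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    epsilon = float(match.group(2))
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return match.group(1), epsilon


def mine(records, column):
    """Mining step (assumed here to be a simple count of true flags)."""
    return sum(1 for record in records if record[column])


def add_laplace_noise(value, sensitivity, epsilon, rng):
    """Noising step: add Laplace(0, sensitivity / epsilon) noise to the result."""
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return value + rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def run_query(query, records, rng):
    """Chain the three steps: parse the query, mine the data, noise the result."""
    column, epsilon = parse(query)
    true_count = mine(records, column)
    # Adding or removing one record changes a count by at most 1.
    return add_laplace_noise(true_count, sensitivity=1.0, epsilon=epsilon, rng=rng)


if __name__ == "__main__":
    # Made-up records for demonstration; not data from the study.
    demo_records = [{"diabetic": True}, {"diabetic": False}, {"diabetic": True}]
    print(run_query("COUNT diabetic EPSILON 0.5", demo_records, random.Random()))
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of accuracy. The sketch does not model the availability figure reported in the Results, because the abstract does not say how it was measured.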