Privacy Preserving Distributed Learning Clustering of HealthCare Data Using Cryptography Protocols

Ahmed M. Elmisery, Huaiguo Fu
Published in: 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops
Publication date: 2010-07-19
DOI: 10.1109/COMPSACW.2010.33 (https://doi.org/10.1109/COMPSACW.2010.33)
Citations: 38

Abstract

Data mining is the process of knowledge discovery in databases (centralized or distributed); it consists of different tasks, each associated with different algorithms. Nowadays, a single centralized database that maintains all the data is difficult to achieve for several reasons, including physical and geographical restrictions and the size of the data itself. One approach to this problem is distributed databases, in which different parties hold horizontal or vertical partitions of the data. The data is normally maintained by more than one organization, each of which aims to keep the information stored in its databases private; privacy-preserving techniques and protocols are therefore designed to perform data mining on distributed data when privacy is a major concern. Cluster analysis is a frequently used data mining task that aims to decompose or partition a usually multivariate data set into groups such that the data objects in one group are the most similar to each other. It plays an important role in fields such as bioinformatics, marketing, machine learning, climate, and healthcare. In this paper we introduce a novel clustering algorithm designed with the goal of enabling a privacy-preserving version of it, along with sub-protocols for secure computations, to handle the clustering of vertically partitioned data among different healthcare data providers.
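The abstract does not spell out the algorithm or its sub-protocols, so the following is only a minimal sketch of one common building block in this setting, not the authors' method. It assumes two hypothetical providers that each hold different attributes of the same patient (a vertical partition), and it combines their local squared distances to a cluster centroid with an additive secret-sharing secure sum. All names, attribute values, and the modulus are illustrative assumptions.

```python
import secrets

# Public modulus (assumption): must exceed any true sum being computed.
MODULUS = 2**61 - 1


def partial_sq_dist(local_record, local_centroid):
    """Squared Euclidean distance over the attributes a single party holds."""
    return sum((x - c) ** 2 for x, c in zip(local_record, local_centroid))


def make_shares(value, n_parties):
    """Split value into n random additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Simulate an additive secret-sharing sum among len(private_values) parties."""
    n = len(private_values)
    # Row i: the shares party i sends out. Column j: the shares party j receives.
    share_matrix = [make_shares(v, n) for v in private_values]
    # Each party publishes only the sum of the shares it received.
    published = [
        sum(share_matrix[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return sum(published) % MODULUS


# Hypothetical vertical partition of one patient record and one centroid.
hospital_record, hospital_centroid = [70, 120], [65, 118]  # age, systolic pressure
lab_record, lab_centroid = [90, 14], [95, 13]              # glucose, hemoglobin

partials = [
    partial_sq_dist(hospital_record, hospital_centroid),
    partial_sq_dist(lab_record, lab_centroid),
]
total = secure_sum(partials)
print(total)  # 55
```

The sketch relies on the fact that squared Euclidean distance decomposes into a sum of per-party terms when attributes are split across parties. With only two parties, the revealed total lets each party infer the other's term, so a practical protocol would need more parties or an additional step such as secure comparison of distances.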