Federated Tensor Factorization for Computational Phenotyping.

KDD : proceedings. International Conference on Knowledge Discovery & Data Mining Pub Date : 2017-08-01 DOI:10.1145/3097983.3098118

Yejin Kim, Jimeng Sun, Hwanjo Yu, Xiaoqian Jiang

Volume 2017, pages 887-895. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5652331/pdf/nihms880922.pdf


Abstract

Tensor factorization models offer an effective approach to convert massive electronic health records into meaningful clinical concepts (phenotypes) for data analysis. These models need a large number of diverse samples to avoid population bias. An open challenge is how to derive phenotypes jointly across multiple hospitals when direct patient-level data sharing is not possible (e.g., due to institutional policies). In this paper, we developed a novel solution to enable federated tensor factorization for computational phenotyping without sharing patient-level data. We developed secure data harmonization and federated computation procedures based on the alternating direction method of multipliers (ADMM). Using this method, the hospitals iteratively update tensors and transfer secure summarized information to a central server, and the server aggregates the information to generate phenotypes. We demonstrated with real medical datasets that our method resembles the centralized training model (based on combined datasets) in terms of accuracy and phenotype discovery while respecting privacy.
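The communication pattern described in the abstract (sites update locally, send a summary to a server, the server aggregates) can be illustrated with generic consensus ADMM. The sketch below is not the authors' algorithm: it swaps tensor factorization for a plain least-squares problem, which is the kind of subproblem that arises when one shared factor is updated with the others held fixed in an alternating scheme. It also omits the paper's secure harmonization and summarization steps. All names (`local_update`, `consensus_admm`, `make_sites`, `centralized_solution`), the synthetic data, and the values of `rho` and `n_iter` are assumptions made for illustration.

```python
import numpy as np


def local_update(A, b, z, u, rho):
    """Site-side step: solve the local least-squares problem with a
    quadratic penalty that pulls the local estimate toward the consensus z."""
    d = A.shape[1]
    lhs = A.T @ A + rho * np.eye(d)
    rhs = A.T @ b + rho * (z - u)
    return np.linalg.solve(lhs, rhs)


def consensus_admm(sites, rho=30.0, n_iter=300):
    """Simulate consensus ADMM in one process.

    `sites` is a list of (A_k, b_k) pairs that never leave their site in a
    real deployment; only x_k + u_k would be sent to the server.
    """
    d = sites[0][0].shape[1]
    z = np.zeros(d)                      # server-side consensus variable
    u = [np.zeros(d) for _ in sites]     # scaled dual variable per site
    for _ in range(n_iter):
        # Each site updates its local estimate from its own data.
        x = [local_update(A, b, z, u_k, rho) for (A, b), u_k in zip(sites, u)]
        # The server averages the summaries it receives.
        z = np.mean([x_k + u_k for x_k, u_k in zip(x, u)], axis=0)
        # Each site updates its dual variable with the new consensus.
        u = [u_k + x_k - z for x_k, u_k in zip(x, u)]
    return z


def make_sites(n_sites=3, n_rows=40, d=4, seed=0):
    """Synthetic stand-in for per-hospital data (hypothetical, not from the paper)."""
    rng = np.random.default_rng(seed)
    x_true = rng.normal(size=d)
    sites = []
    for _ in range(n_sites):
        A = rng.normal(size=(n_rows, d))
        b = A @ x_true + 0.1 * rng.normal(size=n_rows)
        sites.append((A, b))
    return sites


def centralized_solution(sites):
    """Reference: least squares on the pooled data, as if sharing were allowed."""
    A_all = np.vstack([A for A, _ in sites])
    b_all = np.concatenate([b for _, b in sites])
    return np.linalg.lstsq(A_all, b_all, rcond=None)[0]
```

Calling `consensus_admm(make_sites())` and comparing the result with `centralized_solution(make_sites())` mirrors, in miniature, the abstract's comparison between the federated method and a centralized model trained on combined data. Note that exchanging `x_k + u_k` in the clear gives no privacy guarantee by itself; the sketch shows only the iteration structure.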
