Performance vs. Privacy: Evaluating the Performance of Predicting Second Primary Cancer in Lung Cancer Survivors with Privacy-preserving Approaches

2022 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI). Pub Date: 2022-09-27. DOI: 10.1109/BHI56158.2022.9926935

Jui-Fu Hong, Y. Tseng



Abstract

Deep learning is widely used in medicine to support clinical decision making. At the same time, with the rise of data privacy protection, accessing clinical records across institutions can be a challenge. Approaches such as federated learning and transfer learning have been proposed to train models without accessing all the records from each institution, but these privacy-preserving models may not perform as well as centralized approaches, which aggregate all records to build a single model. To explore the potential of privacy-preserving second primary cancer (SPC) prediction in lung cancer survivors using real-world data, we trained machine learning models on data from four hospitals and compared four learning strategies: learning from a single institution, traditional centralized learning, federated learning, and transfer learning. Federated learning outperformed the other learning strategies at three of the four sites (AUROC from 0.733 to 0.777). However, only at Site 6 did federated learning significantly outperform all the other learning strategies (P < 0.05). In summary, federated learning can develop a unified model for multiple institutions while maintaining data security.
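To make the federated setup concrete, here is a minimal sketch of federated averaging followed by a per-site AUROC check. It is an illustration only, not the authors' method: the abstract does not describe their models, features, or aggregation scheme. The logistic-regression model, the synthetic data from `make_site`, the site sizes, and all function names are assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one site's own data."""
    w = list(weights)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def federated_average(site_weights, site_sizes):
    """Average site parameters, weighted by each site's sample count."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(dim)
    ]


def auroc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_site(rng, n):
    """Hypothetical synthetic site: bias term plus two features. Not clinical data."""
    X, y = [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p = sigmoid(1.5 * x1 - 1.0 * x2)
        X.append([1.0, x1, x2])
        y.append(1 if rng.random() < p else 0)
    return X, y


rng = random.Random(0)
sites = [make_site(rng, n) for n in (200, 120, 80, 150)]  # four made-up sites

# Only model weights leave a site; the records themselves stay local.
global_w = [0.0, 0.0, 0.0]
for _ in range(20):  # communication rounds
    local = [local_update(global_w, X, y) for X, y in sites]
    global_w = federated_average(local, [len(X) for X, _ in sites])

site_aurocs = []
for i, (X, y) in enumerate(sites, start=1):
    scores = [sigmoid(sum(w * x for w, x in zip(global_w, xi))) for xi in X]
    site_aurocs.append(auroc(scores, y))
    print(f"site {i}: AUROC = {site_aurocs[-1]:.3f}")
```

In each round, every site trains on its own records and shares only its updated weights, which the server combines in proportion to site size. A centralized baseline would instead pool all four datasets before training, and a single-institution baseline would skip the averaging step entirely.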


