Towards More Efficient Data Valuation in Healthcare Federated Learning using Ensembling.

Distributed, collaborative, and federated learning, and affordable AI and healthcare for resource diverse global health : Third MICCAI Workshop, DeCaF 2022 and Second MICCAI Workshop, FAIR 2022, held in conjunction with MICCAI 2022, Sin...

Pub Date: 2022-09-01. Epub Date: 2022-10-07. DOI: 10.1007/978-3-031-18523-6_12

Sourav Kumar, A Lakshminarayanan, Ken Chang, Feri Guretno, Ivan Ho Mien, Jayashree Kalpathy-Cramer, Pavitra Krishnaswamy, Praveer Singh

Volume 13573, pages 119-129. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9890952/pdf/nihms-1859434.pdf


Abstract

Federated Learning (FL), wherein multiple institutions collaboratively train a machine learning model without sharing data, is becoming popular. Participating institutions might not contribute equally: some contribute more data, some better-quality data, and some more diverse data. To fairly rank the contributions of different institutions, the Shapley value (SV) has emerged as the method of choice. Exact SV computation is impossibly expensive, especially when there are hundreds of contributors, so existing SV computation techniques use approximations. However, in healthcare, where the number of contributing institutions is unlikely to be of a colossal scale, computing exact SVs is still exorbitantly expensive, but not impossible. For such settings, we propose an efficient SV computation technique called SaFE (Shapley Value for Federated Learning using Ensembling). We empirically show that SaFE computes values that are close to exact SVs, and that it performs better than current SV approximations. This is particularly relevant in medical imaging settings, where heterogeneity across institutions is widespread and fast, accurate data valuation is required to determine the contribution of each participant in multi-institutional collaborative learning.
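To make the cost of exact SV computation concrete, the following is a minimal sketch of the textbook exact Shapley value formula, not the SaFE method itself. The function `exact_shapley`, the institution names, the `sizes` table, and `toy_utility` are all hypothetical and chosen only for illustration. In a real FL setting, each call to the utility function would correspond to training and evaluating a model on one coalition of institutions, and with n institutions there are 2^n coalitions.

```python
from itertools import combinations
from math import factorial


def exact_shapley(players, utility):
    """Exact Shapley values by enumerating every coalition.

    `utility` maps a frozenset of players to a number. In FL this would be
    the test performance of a model trained on that coalition (assumption).
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight for coalitions of size k that exclude p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


# Hypothetical toy utility: performance grows linearly with pooled data size.
sizes = {"A": 100, "B": 300, "C": 600}


def toy_utility(coalition):
    return sum(sizes[p] for p in coalition) / 1000.0


sv = exact_shapley(list(sizes), toy_utility)
for name, value in sv.items():
    print(name, round(value, 3))
```

Because the toy utility is additive, each institution's value equals its own share of the data, and the values sum to the utility of the full coalition. Real model performance is not additive, which is why every coalition has to be evaluated.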
