Towards More Efficient Data Valuation in Healthcare Federated Learning using Ensembling.

Sourav Kumar, A Lakshminarayanan, Ken Chang, Feri Guretno, Ivan Ho Mien, Jayashree Kalpathy-Cramer, Pavitra Krishnaswamy, Praveer Singh
DOI: 10.1007/978-3-031-18523-6_12 (https://doi.org/10.1007/978-3-031-18523-6_12)
Published in: Distributed, collaborative, and federated learning, and affordable AI and healthcare for resource diverse global health : Third MICCAI Workshop, DeCaF 2022 and Second MICCAI Workshop, FAIR 2022, held in conjunction with MICCAI 2022, Sin...
Publication date: 2022-09-01 (Epub 2022-10-07)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9890952/pdf/nihms-1859434.pdf

Abstract

Federated Learning (FL), wherein multiple institutions collaboratively train a machine learning model without sharing data, is becoming popular. Participating institutions might not contribute equally: some contribute more data, some better-quality data, and some more diverse data. To fairly rank the contribution of different institutions, Shapley value (SV) has emerged as the method of choice. Exact SV computation is impossibly expensive, especially when there are hundreds of contributors, so existing SV computation techniques use approximations. However, in healthcare, where the number of contributing institutions is unlikely to be of a colossal scale, computing exact SVs is still exorbitantly expensive, but not impossible. For such settings, we propose an efficient SV computation technique called SaFE (Shapley Value for Federated Learning using Ensembling). We empirically show that SaFE computes values that are close to exact SVs, and that it performs better than current SV approximations. This is particularly relevant in medical imaging settings, where heterogeneity across institutions is widespread and fast, accurate data valuation is required to determine the contribution of each participant in multi-institutional collaborative learning.
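
To make the cost of exact SV computation concrete, the following is a minimal sketch of the standard Shapley value definition, not of SaFE itself, whose details are not given in this abstract. The function name `exact_shapley`, the three institutions "A", "B", "C", their dataset sizes, and the `toy_utility` function are all hypothetical and chosen only for illustration. In a real FL setting each utility call would presumably mean training or evaluating a model for one coalition of institutions, and the number of coalitions grows as 2^n, which is why exact computation becomes expensive so quickly.

```python
from itertools import combinations
from math import factorial


def exact_shapley(players, utility):
    """Exact Shapley value of each player for a set-valued utility function.

    Enumerates every coalition that excludes the player, so `utility`
    is called on the order of 2**n times in total.
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # k = size of a coalition S that excludes p
            # Standard Shapley weight: |S|! (n - |S| - 1)! / n!
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition S
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


def toy_utility(coalition):
    """Hypothetical stand-in for 'model quality when trained on this coalition'.

    Uses made-up dataset sizes with diminishing returns; a real utility
    would come from training and evaluating a model.
    """
    sizes = {"A": 100, "B": 300, "C": 600}  # assumed, illustrative only
    return (sum(sizes[i] for i in coalition) / 1000) ** 0.5


shapley = exact_shapley(["A", "B", "C"], toy_utility)
for name, value in shapley.items():
    print(f"{name}: {value:.4f}")
```

With this toy utility the values sum to the utility of the full coalition (the efficiency property of Shapley values), and institutions with more data receive larger values. Replacing `toy_utility` with an actual train-and-evaluate routine is where the exponential cost would appear.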
