Ensemble Federated Learning for Non-IID COVID-19 Detection

2022 5th International Conference on Computing and Informatics (ICCI) Pub Date : 2022-03-09 DOI:10.1109/icci54321.2022.9756090

Khaled M. Elshabrawy, Mayar M. Alfares, Mohammed Abdel-Megeed Salem


Citations: 2

Abstract

In light of the COVID-19 pandemic, a chest X-ray classifier is crucial for diagnosing patients and sorting scans into three classes: normal, COVID-infected, and pneumonia. Federated learning was chosen for this classification task because it trains the model in a decentralized manner, on local servers belonging to each entity in various geographic locations. This prevents the information leakage that can occur with traditional centralized training and also avoids the high cost of central storage. However, the vast differences in the number of X-ray scans per data silo (i.e., hospital), the dissimilar image-acquisition techniques, and the diverse morphological structures of the human chest introduce non-IID (non-Independent and Identically Distributed) skews into the data. In this paper, real-world datasets of COVID and pneumonia scans are used so that all of these non-IID data skews are present. An experiment was then conducted to test the effect of these skews on five federated learning algorithms, FedAvg, FedProx, FedNova, SCAFFOLD, and FedBN, under the same metrics. The obtained accuracy values are 79.5%, 76.92%, 5.57%, 79.18%, and 84.4%, respectively. We present the different effects of non-IID skews on the training process and discuss the federated learning variants that mitigate data heterogeneity.
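As an illustration of the decentralized training the abstract describes, the following is a minimal sketch of the server-side aggregation step of FedAvg, the baseline among the five algorithms listed. It is not the authors' implementation: the names `fedavg` and `Params`, and the representation of a model as a dictionary of flat parameter lists, are assumptions made only for this example. Each hospital's parameters are weighted by its local scan count, which is where a quantity skew between silos enters the global model.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def fedavg(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Average client parameters, weighting each client by its sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Params = {}
    for name, reference in client_params[0].items():
        # Weighted mean of every coordinate of this parameter across clients.
        aggregated[name] = [
            sum(params[name][i] * size
                for params, size in zip(client_params, client_sizes)) / total
            for i in range(len(reference))
        ]
    return aggregated
```

For example, calling `fedavg([{"w": [0.0, 4.0]}, {"w": [4.0, 0.0]}], [100, 300])` pulls the result toward the second, larger silo. Only the scan counts and model parameters reach the server in this sketch; the scans themselves stay local.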

