Mohamad Arafeh, Ahmad Hammoud, H. Otrok, A. Mourad, C. Talhi, Z. Dziong
Title: Independent and Identically Distributed (IID) Data Assessment in Federated Learning
Venue: GLOBECOM 2022 - 2022 IEEE Global Communications Conference
Publication date: 2022-12-04
DOI: https://doi.org/10.1109/GLOBECOM48099.2022.10001718
Citation count: 1
Abstract
Federated learning extends the centralized machine learning architecture by preserving the privacy of data providers. The distributed structure of the resulting federated architecture means that participants' data are often not independent and identically distributed (non-IID), which drastically degrades the performance of the learning process. While most recent work in the federated learning domain has accepted this limitation, only a few studies have addressed the non-IID problem directly. Moreover, these works lack a fundamental analysis of the data's IIDness and/or contradict the privacy guarantees of the federated learning paradigm. In this paper, we focus on evaluating the harmony of the participants by studying their data distributions and calculating their level of compatibility. The devised tool is an assessment technique integrated within the federated learning framework to analyze the data distribution among the trainers. We validate the proposed method experimentally across several scenarios, and the results show that our utility can fairly assess the selected participants before the learning process begins.
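The abstract does not say how the compatibility level is computed, so the following is only an illustrative sketch and not the authors' method. It assumes each participant reports a label histogram rather than raw data, and it scores each participant by one minus the Jensen-Shannon divergence between that participant's label distribution and the average distribution across participants. The function names (`label_distribution`, `js_divergence`, `compatibility_scores`) and the choice of metric are hypothetical. Sharing label histograms may itself reveal some information, so a real deployment would need to weigh that against its privacy requirements.

```python
import math
from collections import Counter


def label_distribution(labels, classes):
    """Return the fraction of samples per class, in the order given by `classes`."""
    if not labels:
        raise ValueError("a participant must hold at least one sample")
    counts = Counter(labels)
    return [counts.get(c, 0) / len(labels) for c in classes]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def compatibility_scores(client_labels, classes):
    """Hypothetical score: 1 - JS(participant distribution, average distribution).

    A score of 1.0 means the participant's label mix matches the average;
    lower scores indicate a more skewed (less IID) participant.
    """
    dists = {name: label_distribution(lbls, classes)
             for name, lbls in client_labels.items()}
    n = len(dists)
    mean = [sum(d[i] for d in dists.values()) / n for i in range(len(classes))]
    return {name: 1.0 - js_divergence(d, mean) for name, d in dists.items()}


if __name__ == "__main__":
    # Toy data: participants "a" and "b" are balanced, "c" holds a single class.
    clients = {"a": [0, 1, 0, 1], "b": [0, 1, 1, 0], "c": [0, 0, 0, 0]}
    for name, score in compatibility_scores(clients, classes=[0, 1]).items():
        print(name, round(score, 3))
```

In this toy setup the single-class participant receives a lower score than the two balanced ones, which is the kind of signal an assessment step could use before training starts. Label histograms capture only label skew; feature-level or quantity skew would need other statistics.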