Budget Distributed Support Vector Machine for Non-ID Federated Learning Scenarios

Á. Navia-Vázquez, Roberto Díaz-Morales, Marcos Fernández Díaz
{"title":"Budget Distributed Support Vector Machine for Non-ID Federated Learning Scenarios","authors":"Á. Navia-Vázquez, Roberto Díaz-Morales, Marcos Fernández Díaz","doi":"10.1145/3539734","DOIUrl":null,"url":null,"abstract":"In recent years, there has been remarkable growth in Federated Learning (FL) approaches because they have proven to be very effective in training large Machine Learning (ML) models and also serve to preserve data confidentiality, as recommended by the GDPR or other business confidentiality restrictions that may apply. Despite the success of FL, performance is greatly reduced when data is not distributed identically (non-ID) across participants, as local model updates tend to diverge from the optimal global solution and thus the model averaging procedure in the aggregator is less effective. Kernel methods such as Support Vector Machines (SVMs) have not seen an equivalent evolution in the area of privacy preserving edge computing because they suffer from inherent computational, privacy and scalability issues. Furthermore, non-linear SVMs do not naturally lead to federated schemes, since locally trained models cannot be passed to the aggregator because they reveal training data (they are built on Support Vectors), and the global model cannot be updated at every worker using gradient descent. In this article, we explore the use of a particular controlled complexity (“Budget”) Distributed SVM (BDSVM) in the FL scenario with non-ID data, which is the least favorable situation, but very common in practice. The proposed BDSVM algorithm is as follows: model weights are broadcasted to workers, which locally update some kernel Gram matrices computed according to a common architectural base and send them back to the aggregator, which finally combines them, updates the global model, and repeats the procedure until a convergence criterion is met. Experimental results using synthetic 2D datasets show that the proposed method can obtain maximal margin decision boundaries even when the data is non-ID distributed. Further experiments using real-world datasets with non-ID data distribution show that the proposed algorithm provides better performance with less communication requirements than a comparable Multilayer Perceptron (MLP) trained using FedAvg. The advantage is more remarkable for a larger number of edge devices. We have also demonstrated the robustness of the proposed method against information leakage, membership inference attacks, and situations with dropout or straggler participants. Finally, in experiments run on separate processes/machines interconnected via the cloud messaging service developed in the context of the EU-H2020 MUSKETEER project, BDSVM is able to train better models than FedAvg in about half the time.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Intelligent Systems and Technology (TIST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3539734","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

In recent years, there has been remarkable growth in Federated Learning (FL) approaches because they have proven to be very effective in training large Machine Learning (ML) models while also preserving data confidentiality, as required by the GDPR or other applicable business confidentiality restrictions. Despite the success of FL, performance is greatly reduced when data is not distributed identically (non-ID) across participants, as local model updates tend to diverge from the optimal global solution and thus the model-averaging procedure in the aggregator becomes less effective. Kernel methods such as Support Vector Machines (SVMs) have not seen an equivalent evolution in the area of privacy-preserving edge computing because they suffer from inherent computational, privacy, and scalability issues. Furthermore, non-linear SVMs do not naturally lead to federated schemes: locally trained models cannot be passed to the aggregator because they reveal training data (they are built on Support Vectors), and the global model cannot be updated at every worker using gradient descent. In this article, we explore the use of a particular controlled-complexity (“Budget”) Distributed SVM (BDSVM) in the FL scenario with non-ID data, which is the least favorable situation but a very common one in practice. The proposed BDSVM algorithm is as follows: model weights are broadcast to the workers, which locally update kernel Gram matrices computed according to a common architectural base and send them back to the aggregator, which combines them, updates the global model, and repeats the procedure until a convergence criterion is met. Experimental results on synthetic 2D datasets show that the proposed method obtains maximal-margin decision boundaries even when the data is non-ID distributed. Further experiments on real-world datasets with non-ID data distributions show that the proposed algorithm achieves better performance with lower communication requirements than a comparable Multilayer Perceptron (MLP) trained using FedAvg. The advantage is more remarkable for larger numbers of edge devices. We also demonstrate the robustness of the proposed method against information leakage, membership inference attacks, and situations with dropout or straggler participants. Finally, in experiments run on separate processes/machines interconnected via the cloud messaging service developed in the context of the EU-H2020 MUSKETEER project, BDSVM trains better models than FedAvg in about half the time.
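To make the communication pattern of the round described above concrete, the following is a minimal sketch in plain NumPy. Everything beyond the abstract's description is an assumption, not the authors' method: an RBF kernel, a shared centroid set C standing in for the "common architectural base", the Gram-matrix summaries each worker returns, and a ridge-regularized least-squares solve at the aggregator in place of the actual SVM-style update. Function names such as worker_update and aggregate are illustrative only, and the full algorithm would repeat this round until a convergence criterion is met.

```python
import numpy as np

def rbf_kernel(X, C, gamma=1.0):
    # Gram matrix between local data X (n x d) and shared centroids C (m x d).
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def worker_update(X, y, C, gamma):
    # Each worker sends back aggregated Gram statistics, never raw samples
    # or Support Vectors.
    K = rbf_kernel(X, C, gamma)              # n_local x m
    return K.T @ K, K.T @ y                  # (m x m) and (m,) summaries

def aggregate(stats, reg=1e-3):
    # Aggregator combines worker summaries and re-solves the global weights;
    # a ridge-regularized least-squares solve stands in for the SVM update.
    m = stats[0][0].shape[0]
    A = sum(S for S, _ in stats) + reg * np.eye(m)
    b = sum(v for _, v in stats)
    return np.linalg.solve(A, b)

# One communication round over simulated non-ID worker shards: each worker's
# data is drawn around a different mean, so no local view matches the global one.
rng = np.random.default_rng(0)
C = rng.normal(size=(10, 2))                 # shared "budget" of 10 centroids
workers = [(rng.normal(loc=i, size=(50, 2)),
            rng.choice([-1.0, 1.0], size=50)) for i in range(3)]
stats = [worker_update(X, y, C, gamma=0.5) for X, y in workers]
w = aggregate(stats)                         # updated global model weights
print("global weights:", np.round(w, 3))
```

Note that in this sketch each worker transmits only fixed-size Gram summaries whose dimensions depend on the budget m, not on the local sample count; this is the property that allows a kernel model to be federated without exposing the training data on which Support Vectors would otherwise be built.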