Towards Federated Learning with Attention Transfer to Mitigate System and Data Heterogeneity of Clients

Hongrui Shi, Valentin Radu
{"title":"Towards Federated Learning with Attention Transfer to Mitigate System and Data Heterogeneity of Clients","authors":"Hongrui Shi, Valentin Radu","doi":"10.1145/3434770.3459739","DOIUrl":null,"url":null,"abstract":"Federated learning is a method of training a global model on the private data of many devices. With a growing spectrum of devices, some slower than smartphones, such as IoT devices, and others faster, such as home data boxes, the standard Federated Learning (FL) method of distributing the same model to all clients is starting to break down-- slow clients inevitably become strugglers. We propose a FL approach that spores different size models, each matching the computational capacity of the client system. There is still a global model, but for the edge tasks, the server trains different size student models with attention transfer, each chosen for a target client. This allows clients to perform enough local updates and still meet the round cut-off time. Client models are used as the source of attention transfer after their local update, to refine the global model on the server. We evaluate our approach on non-IID data to find that attention transfer can be paired with training on metadata brought from the client side to boost the performance of the server model even on previously unseen classes. Our FL with attention transfer opens the opportunity for smaller devices to be included in the Federated Learning training rounds and to integrate even more extreme data distributions.","PeriodicalId":389020,"journal":{"name":"Proceedings of the 4th International Workshop on Edge Systems, Analytics and Networking","volume":"47 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 4th International Workshop on Edge Systems, Analytics and Networking","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3434770.3459739","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

Federated learning is a method of training a global model on the private data of many devices. With a growing spectrum of devices, some slower than smartphones, such as IoT devices, and others faster, such as home data boxes, the standard Federated Learning (FL) method of distributing the same model to all clients is starting to break down: slow clients inevitably become stragglers. We propose an FL approach that spawns different-size models, each matching the computational capacity of its client system. There is still a global model, but for the edge tasks the server trains different-size student models with attention transfer, each chosen for a target client. This allows clients to perform enough local updates and still meet the round cut-off time. After their local update, client models are used as the source of attention transfer to refine the global model on the server. We evaluate our approach on non-IID data and find that attention transfer can be paired with training on metadata brought from the client side to boost the performance of the server model even on previously unseen classes. Our FL with attention transfer opens the opportunity for smaller devices to be included in Federated Learning training rounds and for even more extreme data distributions to be integrated.
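The abstract leaves the attention-transfer objective implicit. Below is a minimal sketch of activation-based attention transfer (in the style of Zagoruyko and Komodakis), which a server could in principle use both to distil size-matched student models from the global model and to fold client updates back into it. The function names, the layer-matching assumption, and the mean-squared distance between normalised attention maps are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def attention_map(feature):
    # Activation-based attention map: average of squared activations
    # over the channel dimension, flattened and L2-normalised per sample.
    # `feature` is assumed to have shape (batch, channels, height, width).
    am = feature.pow(2).mean(dim=1)      # (batch, height, width)
    am = am.flatten(start_dim=1)         # (batch, height * width)
    return F.normalize(am, p=2, dim=1)

def attention_transfer_loss(teacher_feats, student_feats):
    # Mean squared distance between normalised attention maps taken at
    # matching intermediate layers of the teacher and the student.
    # Assumes the two lists hold spatially compatible feature maps.
    loss = 0.0
    for t, s in zip(teacher_feats, student_feats):
        loss = loss + (attention_map(t) - attention_map(s)).pow(2).mean()
    return loss
```

In a sketch like this, the same loss can point in either direction: the global model acts as teacher when producing the per-client students, and a returned client model acts as teacher when its attention maps are used to refine the global model on the server.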