Accelerated Training via Device Similarity in Federated Learning

Yuanli Wang, Joel Wolfrath, N. Sreekumar, Dhruv Kumar, A. Chandra
{"title":"联邦学习中基于设备相似性的加速训练","authors":"Yuanli Wang, Joel Wolfrath, N. Sreekumar, Dhruv Kumar, A. Chandra","doi":"10.1145/3434770.3459734","DOIUrl":null,"url":null,"abstract":"Federated Learning is a privacy-preserving, machine learning technique that generates a globally shared model with in-situ model training on distributed devices. These systems are often comprised of millions of user devices and only a subset of available devices can be used for training in each epoch. Designing a device selection strategy is challenging, given that devices are highly heterogeneous in both their system resources and training data. This heterogeneity makes device selection very crucial for timely model convergence and sufficient model accuracy. Existing approaches have addressed system heterogeneity for device selection but have largely ignored the data heterogeneity. In this work, we analyze the impact of data heterogeneity on device selection, model convergence, model accuracy, and fault tolerance in a federated learning setting. Based on our analysis, we propose that clustering devices with similar data distributions followed by selecting the devices with the best processing capacity from each cluster can significantly improve the model convergence without compromising model accuracy. This clustering also guides us in designing policies for fault tolerance in the system. We propose three methods for identifying groups of devices with similar data distributions. We also identify and discuss rich trade-offs between privacy, bandwidth consumption, and computation overhead for each of these proposed methods. Our preliminary experiments show that the proposed methods can provide a 46% - 58% reduction in training time compared to existing approaches in reaching the same accuracy.","PeriodicalId":389020,"journal":{"name":"Proceedings of the 4th International Workshop on Edge Systems, Analytics and Networking","volume":"131 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Accelerated Training via Device Similarity in Federated Learning\",\"authors\":\"Yuanli Wang, Joel Wolfrath, N. Sreekumar, Dhruv Kumar, A. Chandra\",\"doi\":\"10.1145/3434770.3459734\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Federated Learning is a privacy-preserving, machine learning technique that generates a globally shared model with in-situ model training on distributed devices. These systems are often comprised of millions of user devices and only a subset of available devices can be used for training in each epoch. Designing a device selection strategy is challenging, given that devices are highly heterogeneous in both their system resources and training data. This heterogeneity makes device selection very crucial for timely model convergence and sufficient model accuracy. Existing approaches have addressed system heterogeneity for device selection but have largely ignored the data heterogeneity. In this work, we analyze the impact of data heterogeneity on device selection, model convergence, model accuracy, and fault tolerance in a federated learning setting. Based on our analysis, we propose that clustering devices with similar data distributions followed by selecting the devices with the best processing capacity from each cluster can significantly improve the model convergence without compromising model accuracy. 
This clustering also guides us in designing policies for fault tolerance in the system. We propose three methods for identifying groups of devices with similar data distributions. We also identify and discuss rich trade-offs between privacy, bandwidth consumption, and computation overhead for each of these proposed methods. Our preliminary experiments show that the proposed methods can provide a 46% - 58% reduction in training time compared to existing approaches in reaching the same accuracy.\",\"PeriodicalId\":389020,\"journal\":{\"name\":\"Proceedings of the 4th International Workshop on Edge Systems, Analytics and Networking\",\"volume\":\"131 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-04-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 4th International Workshop on Edge Systems, Analytics and Networking\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3434770.3459734\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 4th International Workshop on Edge Systems, Analytics and Networking","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3434770.3459734","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 10

Abstract

Federated Learning is a privacy-preserving, machine learning technique that generates a globally shared model with in-situ model training on distributed devices. These systems are often comprised of millions of user devices and only a subset of available devices can be used for training in each epoch. Designing a device selection strategy is challenging, given that devices are highly heterogeneous in both their system resources and training data. This heterogeneity makes device selection very crucial for timely model convergence and sufficient model accuracy. Existing approaches have addressed system heterogeneity for device selection but have largely ignored the data heterogeneity. In this work, we analyze the impact of data heterogeneity on device selection, model convergence, model accuracy, and fault tolerance in a federated learning setting. Based on our analysis, we propose that clustering devices with similar data distributions followed by selecting the devices with the best processing capacity from each cluster can significantly improve the model convergence without compromising model accuracy. This clustering also guides us in designing policies for fault tolerance in the system. We propose three methods for identifying groups of devices with similar data distributions. We also identify and discuss rich trade-offs between privacy, bandwidth consumption, and computation overhead for each of these proposed methods. Our preliminary experiments show that the proposed methods can provide a 46% - 58% reduction in training time compared to existing approaches in reaching the same accuracy.
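As a concrete illustration of the selection strategy described above, the sketch below clusters devices by a simple proxy for their data distributions and then picks the fastest device from each cluster. This is a minimal, hypothetical rendering, not the authors' implementation: the label-histogram similarity measure, the KMeans clustering step, and the scalar speed score are all illustrative assumptions (the paper proposes three distinct similarity methods, which the abstract does not name).

```python
# Hypothetical sketch of cluster-then-select device selection for one
# federated training round. Assumptions: each device's data distribution
# is summarized by a normalized class-label histogram, similarity is
# captured by KMeans over those histograms, and "processing capacity"
# is a single speed score per device.

import numpy as np
from sklearn.cluster import KMeans

def label_histogram(labels: np.ndarray, num_classes: int) -> np.ndarray:
    """Normalized label histogram: a cheap proxy for a device's data distribution."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts / counts.sum()

def select_devices(device_labels, device_speeds, num_classes, num_clusters):
    """Cluster devices on label histograms; return the fastest device per cluster."""
    hists = np.stack([label_histogram(l, num_classes) for l in device_labels])
    assignments = KMeans(n_clusters=num_clusters, n_init=10).fit_predict(hists)
    selected = []
    for c in range(num_clusters):
        members = np.where(assignments == c)[0]
        # Pick the member with the best processing capacity (highest speed score),
        # so each cluster's data distribution is represented without waiting on
        # slow devices.
        selected.append(int(members[np.argmax(device_speeds[members])]))
    return selected

# Example: 6 devices, 3 classes, one device chosen from each of 2 clusters.
rng = np.random.default_rng(0)
device_labels = [rng.integers(0, 3, size=100) for _ in range(6)]
device_speeds = np.array([1.0, 3.2, 0.5, 2.1, 4.0, 1.7])  # higher = faster
print(select_devices(device_labels, device_speeds, num_classes=3, num_clusters=2))
```

Sharing raw label histograms with the coordinator is itself a privacy/bandwidth/computation trade-off of the kind the paper discusses; any deployed variant would need to weigh what distributional information devices are willing to reveal.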