Exploring Decentralized Collaboration in Heterogeneous Edge Training

2020 IEEE/ACM Symposium on Edge Computing (SEC), Pub Date: 2020-11-01, DOI: 10.1109/SEC50012.2020.00069

Xiang Chen, Zhuwei Qin


Citations: 2

Abstract

Recent progress in deep learning techniques has enabled collaborative edge training, which usually deploys identical neural network models globally on multiple devices and aggregates parameter updates over distributed data collection. However, as more and more heterogeneous edge devices are involved in practical training, identical model deployment across collaborating edge devices cannot be guaranteed. On one hand, weak edge devices with fewer computation resources may not keep up with the training progress of stronger ones, so appropriate local model training customization is necessary to balance the collaboration. On the other hand, a particular local edge device may have a specific learning task preference, in which case the globally identical model would exceed the practical local demand and cause unnecessary computation cost. Therefore, we explored collaborative learning with heterogeneous convolutional neural networks (CNNs) in this work, aiming to address these practical problems. Specifically, we proposed a novel decentralized collaborative training method that decouples a target CNN model into independently trainable sub-models, each corresponding to a subset of learning tasks for one edge device. After the sub-models are well trained on the edge nodes, the model parameters for individual learning tasks can be harvested from the local models on every edge device and assembled back into a single global model. Experiments demonstrate that, for AlexNet and VGG on the CIFAR10, CIFAR100 and KWS datasets, our decentralized training method can reduce the computation load by up to 11.8× while achieving the test accuracy of central-server training.
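The decouple, train locally, harvest and reassemble workflow described in the abstract can be illustrated with a toy sketch. The abstract does not specify how the CNN is decoupled or how parameters are merged, so everything below is an assumption made for illustration: a nearest-prototype classifier stands in for a CNN sub-model, the function names (`train_sub_model`, `assemble_global_model`, `predict`) and the sample data are hypothetical, and averaging parameters for classes that appear on several devices is a choice of this sketch, not the authors' stated method.

```python
from collections import defaultdict


def train_sub_model(samples):
    """Stand-in for local training: one mean feature vector per class.

    `samples` is a list of (feature_vector, class_id) pairs held by one
    device. A real system would train a CNN sub-model here instead.
    """
    sums, counts = {}, defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def assemble_global_model(sub_models):
    """Harvest per-class parameters from every device and merge them.

    Classes trained on several devices are averaged (an assumption of
    this sketch, not something the abstract states).
    """
    collected = defaultdict(list)
    for sub_model in sub_models:
        for class_id, params in sub_model.items():
            collected[class_id].append(params)
    return {
        c: [sum(vals) / len(vals) for vals in zip(*param_list)]
        for c, param_list in collected.items()
    }


def predict(model, features):
    """Classify by nearest class prototype (squared Euclidean distance)."""
    def dist(params):
        return sum((p - f) ** 2 for p, f in zip(params, features))
    return min(model, key=lambda c: dist(model[c]))


# Two hypothetical devices, each holding data for its own subset of tasks.
device_a = [([0.0, 0.0], "cat"), ([0.2, 0.0], "cat"), ([1.0, 1.0], "dog")]
device_b = [([5.0, 5.0], "car"), ([5.0, 5.2], "car")]

sub_models = [train_sub_model(device_a), train_sub_model(device_b)]
global_model = assemble_global_model(sub_models)
```

In this sketch each device only ever computes parameters for the classes it holds, which mirrors the idea that a device should not pay for tasks outside its local demand. The assembled `global_model` still covers every class seen on any device.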

