Proceedings of the 1st Workshop on Distributed Machine Learning: Latest Publications

FLaaS
Proceedings of the 1st Workshop on Distributed Machine Learning. Pub Date: 2020-12-01. DOI: 10.1145/3426745.3431337
N. Kourtellis, Kleomenis Katevas, Diego Perino
Abstract: Federated Learning (FL) is emerging as a promising technology for building machine learning models in a decentralized, privacy-preserving fashion. FL enables local training on user devices, avoids transferring user data to centralized servers, and can be enhanced with differential privacy mechanisms. Although FL has recently been deployed in real systems, the possibility of collaborative modeling across different 3rd-party applications has not yet been explored. In this paper, we tackle this problem and present Federated Learning as a Service (FLaaS), a system enabling different scenarios of 3rd-party application collaborative model building and addressing the consequent challenges of permission and privacy management, usability, and hierarchical model training. FLaaS can be deployed in different operational environments. As a proof of concept, we implement it in a mobile phone setting and discuss the practical implications of results on simulated and real devices with respect to on-device training CPU cost, memory footprint, and power consumed per FL model round. We demonstrate FLaaS's feasibility in building unique or joint FL models across applications for image object detection in a few hours, across 100 devices.
Citations: 4
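The abstract above rests on the standard federated learning loop: each application trains locally and a server averages the resulting models. The sketch below illustrates one such round with plain NumPy; the linear model, function names, and dataset shapes are illustrative assumptions, not part of the FLaaS API.

```python
import numpy as np

def local_update(weights, features, targets, lr=0.01, epochs=1):
    # Hypothetical on-device step: a few epochs of gradient descent on a
    # linear model, standing in for the client-side training an app runs locally.
    w = weights.copy()
    for _ in range(epochs):
        grad = features.T @ (features @ w - targets) / len(targets)
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    # Server-side aggregation: weight each client's model by its local
    # dataset size (classic FedAvg).
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# One toy round across three simulated apps/devices.
rng = np.random.default_rng(0)
global_w = np.zeros(5)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
updates = [local_update(global_w, x, y) for x, y in clients]
global_w = federated_average(updates, [len(y) for _, y in clients])
print(global_w)
```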
Huffman Coding Based Encoding Techniques for Fast Distributed Deep Learning
Proceedings of the 1st Workshop on Distributed Machine Learning. Pub Date: 2020-12-01. DOI: 10.1145/3426745.3431334
Rishikesh R. Gajjala, Shashwat Banchhor, A. Abdelmoniem, Aritra Dutta, M. Canini, Panos Kalnis
Abstract: Distributed stochastic algorithms equipped with gradient compression techniques, such as codebook quantization, are becoming increasingly popular and are considered state-of-the-art in training large deep neural network (DNN) models. However, communicating the quantized gradients over a network requires efficient encoding techniques. For this, practitioners generally use Elias encoding-based techniques without considering their computational overhead or data volume. In this paper, based on Huffman coding, we propose several lossless encoding techniques that exploit different characteristics of the quantized gradients during distributed DNN training. We then show their effectiveness on 5 different DNN models across three different datasets, and compare them with classic state-of-the-art Elias-based encoding techniques. Our results show that the proposed Huffman-based encoders (i.e., RLH, SH, and SHS) can reduce the encoded data volume by up to 5.1×, 4.32×, and 3.8×, respectively, compared to the Elias-based encoders.
Citations: 18
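The core idea of the entry above is that codebook-quantized gradients have a highly skewed symbol distribution, so a Huffman code built from the symbol frequencies compresses the stream well. The sketch below builds such a code with the standard-library heapq module; it illustrates plain Huffman coding only and does not reproduce the paper's RLH, SH, or SHS variants.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code(symbols):
    # Build a prefix code (symbol -> bit string) from symbol frequencies.
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so heapq never compares the dicts
    heap = [(f, next(tie), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + bits for s, bits in c1.items()}
        merged.update({s: "1" + bits for s, bits in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

# Quantized gradient stream: most entries fall into a few codebook bins.
quantized = [0, 0, 0, 1, 0, -1, 0, 0, 1, 0, 0, 0, -1, 0, 0, 0]
code = huffman_code(quantized)
encoded = "".join(code[s] for s in quantized)
print(code, len(encoded), "bits vs", len(quantized) * 2, "bits at 2 bits/symbol")
```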
Accelerating Intra-Party Communication in Vertical Federated Learning with RDMA
Proceedings of the 1st Workshop on Distributed Machine Learning. Pub Date: 2020-12-01. DOI: 10.1145/3426745.3431333
Duowen Liu
Abstract: Federated learning (FL) has emerged as an elegant privacy-preserving distributed machine learning (ML) paradigm. In particular, vertical FL (VFL) has promising application prospects for collaborating organizations that own data on the same set of users but with disjoint features, allowing them to jointly train models without leaking their private data to each other. As the volume of training data and the model size increase rapidly, each organization may deploy a cluster of many servers to participate in the federation. As such, the intra-party communication cost (i.e., network transfers within each organization's cluster) can significantly impact the entire VFL job's performance. Despite this, existing FL frameworks use the inefficient gRPC for intra-party communication, leading to high latency and high CPU cost. In this paper, we propose a design that transmits data with RDMA for intra-party communication, with no modifications to applications. To improve network efficiency, we further propose an RDMA usage arbiter that dynamically adjusts the RDMA bandwidth used by a non-straggler party, and a query data size optimizer that automatically finds the optimal query data size that each response carries. Our preliminary results show that RDMA-based intra-party communication is 10x faster than gRPC-based communication, leading to a 9% reduction in the completion time of a VFL job. Moreover, the RDMA usage arbiter can save over 90% of bandwidth, and the query data size optimizer can improve transmission speed by 18%.
Citations: 3
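The query data size optimizer mentioned above can be pictured as a measurement loop: try several candidate response sizes, time each transfer, and keep the size with the highest throughput. The sketch below simulates the transfer instead of issuing real RDMA verbs, and every function name in it is a hypothetical stand-in rather than part of the paper's system.

```python
import time

def measure_throughput(transfer_fn, payload_bytes, trials=3):
    # Time a transfer callback and report average bytes per second.
    elapsed = 0.0
    for _ in range(trials):
        start = time.perf_counter()
        transfer_fn(payload_bytes)
        elapsed += time.perf_counter() - start
    return payload_bytes * trials / elapsed

def pick_query_size(transfer_fn, candidates=(64 << 10, 256 << 10, 1 << 20, 4 << 20)):
    # Return the candidate response size with the highest measured throughput.
    return max(candidates, key=lambda size: measure_throughput(transfer_fn, size))

# Stand-in for an intra-party transfer: a fixed per-message overhead plus a
# bandwidth term, so larger responses amortize the overhead.
def simulated_transfer(nbytes, overhead_s=1e-4, bytes_per_s=5e9):
    time.sleep(overhead_s + nbytes / bytes_per_s)

print("chosen query size:", pick_query_size(simulated_transfer))
```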
FEWER
Proceedings of the 1st Workshop on Distributed Machine Learning. Pub Date: 2020-12-01. DOI: 10.1145/3426745.3431335
Yongjin Shin, Gihun Lee, Seungjae Shin, Se-Young Yun, Il-Chul Moon
Abstract: In federated learning, local devices independently train the model on their local data, and the server gathers the locally trained models and aggregates them into a shared global model. Federated learning is therefore an approach that decouples model training from direct access to the local data. However, the requirement of periodic communication of model parameters is a primary bottleneck for the efficiency of federated learning. This work proposes a novel federated learning algorithm, Federated Weight Recovery (FEWER), which enables a sparsely pruned model in the training phase. FEWER starts the initial model training in an extremely sparse state and gradually grows the model capacity until the model becomes dense at the end of training. The level of sparsity becomes the leverage for either increasing accuracy or decreasing communication cost, and this sparsification can be beneficial to practitioners. Our experimental results show that FEWER achieves higher test accuracies with lower communication costs for most of the test cases.
Citations: 1
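The sparse-to-dense training described above can be sketched as a binary mask over the weights that starts nearly empty and is grown each round until the model is dense, so that only the active weights need to be communicated. The linear density schedule and magnitude-based growth rule below are assumptions for illustration, not FEWER's exact recovery rule.

```python
import numpy as np

def density_schedule(round_idx, total_rounds, start=0.05, end=1.0):
    # Fraction of weights kept active at a given round (linear growth).
    frac = round_idx / max(total_rounds - 1, 1)
    return start + (end - start) * frac

def grow_mask(weights, mask, target_density):
    # Activate additional weights, largest magnitude first, until the
    # target density is reached (an assumed growth criterion).
    n_target = int(target_density * weights.size)
    n_active = int(mask.sum())
    if n_active >= n_target:
        return mask
    inactive = np.flatnonzero(mask.ravel() == 0)
    order = np.argsort(-np.abs(weights.ravel()[inactive]))
    new_mask = mask.ravel().copy()
    new_mask[inactive[order[: n_target - n_active]]] = 1
    return new_mask.reshape(mask.shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
mask = np.zeros_like(w)
for r in range(5):
    mask = grow_mask(w, mask, density_schedule(r, 5))
    # Only the active weights (and their updates) would be communicated.
    print(f"round {r}: density = {mask.mean():.2f}")
```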
Maggy
Proceedings of the 1st Workshop on Distributed Machine Learning. Pub Date: 2020-12-01. DOI: 10.1145/3426745.3431338
Moritz Meister, Sina Sheikholeslami, A. H. Payberah, Vladimir Vlassov, J. Dowling
Abstract: Running extensive experiments is essential for building Machine Learning (ML) models. Such experiments usually require iterative execution of many trials with varying run times. In recent years, Apache Spark has become the de-facto standard for parallel data processing in industry, in which iterative processes are implemented within the bulk-synchronous parallel (BSP) execution model. The BSP approach is also used to parallelize ML trials in Spark. However, the BSP task synchronization barriers prevent asynchronous execution of trials, which reduces the number of trials that can be run on a given computational budget. In this paper, we introduce Maggy, an open-source framework based on Spark that executes ML trials asynchronously in parallel, with the ability to early-stop poorly performing trials. In our experiments, we compare Maggy with the BSP execution of parallel trials in Spark and show that, for random hyperparameter search on a convolutional neural network for the Fashion-MNIST dataset, Maggy reduces the time required to execute a fixed number of trials by 33% to 58%, without any loss in final model accuracy.
Citations: 8
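The scheduling idea above, running trials asynchronously and early-stopping poor performers, can be illustrated with the standard library alone. The sketch below uses a thread pool and a simple relative-score stopping rule; it mirrors the concept only and is not Maggy's Spark-based API.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

best_scores = {}  # trial_id -> best running score seen so far

def stop_rule(trial_id, step, score):
    # Early-stop a trial whose running score lags far behind the best
    # trial observed so far (a simple stand-in stopping criterion).
    best_scores[trial_id] = max(best_scores.get(trial_id, 0.0), score)
    best = max(best_scores.values())
    return step > 5 and score < 0.5 * best

def run_trial(trial_id, lr, stop_check, steps=20):
    # Hypothetical trial: accumulate a score each step, asking the
    # driver-side stop_check whether to terminate early.
    score = 0.0
    for step in range(steps):
        score += random.random() * lr  # stand-in for a validation gain
        if stop_check(trial_id, step, score):
            return trial_id, round(score, 4), "early-stopped"
    return trial_id, round(score, 4), "finished"

configs = [10 ** random.uniform(-3, -1) for _ in range(8)]  # random search over lr
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(run_trial, i, lr, stop_rule) for i, lr in enumerate(configs)]
    for fut in as_completed(futures):  # results arrive as trials finish, asynchronously
        print(fut.result())
```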
Neural Enhancement in Content Delivery Systems: The State-of-the-Art and Future Directions
Proceedings of the 1st Workshop on Distributed Machine Learning. Pub Date: 2020-10-12. DOI: 10.1145/3426745.3431336
Royson Lee, Stylianos I. Venieris, N. Lane
Abstract: Internet-enabled smartphones and ultra-wide displays are transforming a variety of visual apps, ranging from on-demand movies and 360° videos to video-conferencing and live streaming. However, robustly delivering visual content under fluctuating networking conditions on devices of diverse capabilities remains an open problem. In recent years, advances in deep learning on tasks such as super-resolution and image enhancement have led to unprecedented performance in generating high-quality images from low-quality ones, a process we refer to as neural enhancement. In this paper, we survey state-of-the-art content delivery systems that employ neural enhancement as a key component in achieving both fast response time and high visual quality. We first present the deployment challenges of neural enhancement models. We then cover systems targeting diverse use-cases and analyze their design decisions in overcoming technical challenges. Moreover, we present promising directions based on the latest insights from deep learning research to further boost the quality of experience of these systems.
Citations: 6
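Neural enhancement, as defined above, means reconstructing a high-quality frame from a low-quality one on the client after delivery. The sketch below is a deliberately tiny PyTorch super-resolution model, assumed purely for illustration; it does not correspond to any specific system surveyed in the paper.

```python
import torch
import torch.nn as nn

class TinySR(nn.Module):
    # A very small super-resolution net: a few conv layers plus pixel-shuffle
    # upsampling, standing in for the neural-enhancement models the survey covers.
    def __init__(self, scale=2, channels=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3 * scale * scale, 3, padding=1),
        )
        self.upsample = nn.PixelShuffle(scale)

    def forward(self, low_res):
        return self.upsample(self.body(low_res))

# Client-side enhancement of a received low-quality frame (random tensor here).
model = TinySR()
low_res_frame = torch.rand(1, 3, 90, 160)   # e.g. a 160x90 chunk of a stream
with torch.no_grad():
    high_res_frame = model(low_res_frame)   # -> shape (1, 3, 180, 320)
print(high_res_frame.shape)
```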