{"title":"联邦学习有多异步?","authors":"Ningxin Su, Baochun Li","doi":"10.1109/IWQoS54832.2022.9812885","DOIUrl":null,"url":null,"abstract":"As a practical paradigm designed to involve large numbers of edge devices in distributed training of deep learning models, federated learning has witnessed a significant amount of research attention in the recent years. Yet, most existing mechanisms on federated learning assumed either fully synchronous or asynchronous communication strategies between clients and the federated learning server. Existing designs that were partially asynchronous in their communication were simple heuristics, and were evaluated using the number of communication rounds or updates required for convergence, rather than the wall-clock time in practice.In this paper, we seek to explore the entire design space between fully synchronous and asynchronous mechanisms of communication. Based on insights from our exploration, we propose Port, a new partially asynchronous mechanism designed to allow fast clients to aggregate asynchronously, yet without waiting excessively for the slower ones. In addition, Port is designed to adjust the aggregation weights based on both the staleness and divergence of model updates, with provable convergence guarantees. We have implemented Port and its leading competitors in Plato, an open-source scalable federated learning research framework designed from the ground up to emulate real-world scenarios. 
With respect to the wall-clock time it takes for converging to the target accuracy, Port outperformed its closest competitor, FedBuff, by up to 40% in our experiments.","PeriodicalId":353365,"journal":{"name":"2022 IEEE/ACM 30th International Symposium on Quality of Service (IWQoS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"How Asynchronous can Federated Learning Be?\",\"authors\":\"Ningxin Su, Baochun Li\",\"doi\":\"10.1109/IWQoS54832.2022.9812885\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As a practical paradigm designed to involve large numbers of edge devices in distributed training of deep learning models, federated learning has witnessed a significant amount of research attention in the recent years. Yet, most existing mechanisms on federated learning assumed either fully synchronous or asynchronous communication strategies between clients and the federated learning server. Existing designs that were partially asynchronous in their communication were simple heuristics, and were evaluated using the number of communication rounds or updates required for convergence, rather than the wall-clock time in practice.In this paper, we seek to explore the entire design space between fully synchronous and asynchronous mechanisms of communication. Based on insights from our exploration, we propose Port, a new partially asynchronous mechanism designed to allow fast clients to aggregate asynchronously, yet without waiting excessively for the slower ones. In addition, Port is designed to adjust the aggregation weights based on both the staleness and divergence of model updates, with provable convergence guarantees. 
We have implemented Port and its leading competitors in Plato, an open-source scalable federated learning research framework designed from the ground up to emulate real-world scenarios. With respect to the wall-clock time it takes for converging to the target accuracy, Port outperformed its closest competitor, FedBuff, by up to 40% in our experiments.\",\"PeriodicalId\":353365,\"journal\":{\"name\":\"2022 IEEE/ACM 30th International Symposium on Quality of Service (IWQoS)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/ACM 30th International Symposium on Quality of Service (IWQoS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IWQoS54832.2022.9812885\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM 30th International Symposium on Quality of Service (IWQoS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IWQoS54832.2022.9812885","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
As a practical paradigm designed to involve large numbers of edge devices in the distributed training of deep learning models, federated learning has attracted a significant amount of research attention in recent years. Yet, most existing mechanisms for federated learning assume either fully synchronous or fully asynchronous communication strategies between clients and the federated learning server. Existing designs that were partially asynchronous in their communication were simple heuristics, and were evaluated using the number of communication rounds or updates required for convergence, rather than the wall-clock time observed in practice. In this paper, we seek to explore the entire design space between fully synchronous and asynchronous mechanisms of communication. Based on insights from our exploration, we propose Port, a new partially asynchronous mechanism designed to allow fast clients to aggregate asynchronously, yet without waiting excessively for the slower ones. In addition, Port is designed to adjust the aggregation weights based on both the staleness and divergence of model updates, with provable convergence guarantees. We have implemented Port and its leading competitors in Plato, an open-source scalable federated learning research framework designed from the ground up to emulate real-world scenarios. With respect to the wall-clock time required to converge to the target accuracy, Port outperformed its closest competitor, FedBuff, by up to 40% in our experiments.
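The abstract does not spell out Port's weighting rule. As a minimal illustrative sketch only — the polynomial staleness decay and the divergence penalty below are assumptions for exposition, not the paper's actual formulas — a staleness- and divergence-aware aggregation step over a buffer of client updates might look like this:

```python
import numpy as np

def aggregate(server_model, updates, staleness_decay=0.5):
    """Combine buffered client updates into the server model.

    Each entry in `updates` is (delta, staleness). Weights shrink
    with staleness (assumed polynomial decay) and with divergence
    from the mean buffered update (assumed penalty); both choices
    are illustrative, not Port's published rule.
    """
    deltas = [delta for delta, _ in updates]
    mean_delta = np.mean(deltas, axis=0)

    weights = []
    for delta, staleness in updates:
        s_w = (1.0 + staleness) ** (-staleness_decay)  # discount stale updates
        div = np.linalg.norm(delta - mean_delta)       # distance from the mean update
        d_w = 1.0 / (1.0 + div)                        # discount divergent updates
        weights.append(s_w * d_w)

    weights = np.array(weights) / sum(weights)         # normalize to sum to 1
    combined = sum(w * d for w, d in zip(weights, deltas))
    return server_model + combined
```

In this sketch, a fresh update (staleness 0) that agrees with its peers receives full weight, while a stale or outlying update is discounted before the weighted sum is applied to the server model, which matches the qualitative behavior the abstract describes.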