Ziyi Han, Ruiting Zhou, Jinlong Pang, Yue Cao, Haisheng Tan
2021 IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS), published 2021-12-01. DOI: 10.1109/ICPADS53394.2021.00080 (https://doi.org/10.1109/ICPADS53394.2021.00080)
Online Scheduling Unbiased Distributed Learning over Wireless Edge Networks
To realize high-quality smart IoT services, such as intelligent video surveillance for autonomous driving and smart cities, a tremendous number of distributed machine learning jobs train unbiased models in wireless edge networks, adopting the parameter server (PS) architecture. Because the large datasets are collected across geo-distributed sites, unbiased distributed learning (UDL) training incurs high response latency and bandwidth consumption. In this paper, we propose an online scheduling algorithm, Okita, to minimize both the latency cost and the bandwidth cost of UDL. At each time slot, Okita schedules UDL jobs by jointly deciding the execution time window, the amount of training data, and the number and location of concurrent workers and PSs at each site. To evaluate the practical performance of Okita, we implement a testbed based on Kubernetes. Extensive experiments and simulations show that Okita reduces total cost by up to 60% compared with state-of-the-art schedulers in cloud systems.