Qiu Zhen, Fan Xu, Wenpu Li, Fan Yang, Hongyu Wu, Huanhuan Li
DOI: 10.1117/12.3032073 (https://doi.org/10.1117/12.3032073). Published 2024-06-06 in Other Conferences.
Research on distributed heterogeneous task scheduling and resource allocation algorithms based on deep learning
As deep learning develops and spreads, datasets and network models keep growing, and distributed model training is becoming increasingly common. This paper proposes a deep-learning-based algorithm for distributed heterogeneous task scheduling and resource allocation. It targets problems that arise during distributed collaborative training: heterogeneous resource usage, unpredictable task convergence time, communication-time bottlenecks, and resource waste caused by static resource allocation. The algorithm schedules heterogeneous tasks and allocates resources dynamically, which reduces task completion time in clusters. Experiments show that the proposed algorithm significantly improves both task completion time and system duration.