Authors: Yawen Wang, Qing Wang
Venue: 2019 6th International Conference on Dependable Systems and Their Applications (DSA)
Publication date: 2020-01-01
DOI: 10.1109/DSA.2019.00014
URL: https://doi.org/10.1109/DSA.2019.00014
Learning Latest Private-Cluster-State to Improve the Performance of Sample-Based Cluster Scheduling
Sample-based cluster scheduling is considered promising for its high scalability and low latency. Its major limitation, however, is its very limited view of cluster resource state, which constrains both its decision precision and its support for many important scheduling features. Several approaches have tried to address this limitation, yet most are high-cost solutions that rely on either extra communication or additional system components to collect more resource information, which damages the scalability and latency of sample-based cluster scheduling. In this paper, we propose L-PCS, a novel learning-based approach that uses the latest private-cluster-state to build relatively accurate knowledge of the global cluster state. L-PCS gathers and learns from the process data of schedulers and predicts a more precise approximation of the real-time cluster state for each scheduler. The model is dynamic: it is updated over time so that its predictions remain valid. The results predicted by the trained model serve as references when schedulers make scheduling decisions. Experiments show that, compared with sample-based schedulers without such a learning mechanism, L-PCS improves mean absolute error by 2× to 3×, and gang scheduling results show a maximum increase of 10.1% to 25.09%.
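
To make the "limited view" concrete, here is a minimal Python sketch of generic sample-based placement and of how mean absolute error could be measured between a scheduler's private view and the true cluster state. It is not the authors' L-PCS implementation, and it contains no learning step. The function names (`sample_based_pick`, `private_view`, `mean_absolute_error`), the probe count `d`, and the default guess for unprobed workers are all assumptions made for illustration.

```python
import random
from typing import Dict, List


def sample_based_pick(loads: List[int], d: int, rng: random.Random) -> int:
    """Probe d distinct random workers and return the least loaded one among them."""
    probed = rng.sample(range(len(loads)), d)
    return min(probed, key=lambda w: loads[w])


def private_view(last_seen: Dict[int, int], n_workers: int,
                 default: float = 0.0) -> List[float]:
    """Hypothetical private cluster state: the load last observed for each
    probed worker, and a default guess for workers never probed."""
    return [float(last_seen.get(w, default)) for w in range(n_workers)]


def mean_absolute_error(estimate: List[float], truth: List[float]) -> float:
    """Average absolute difference between estimated and true per-worker load."""
    return sum(abs(e - t) for e, t in zip(estimate, truth)) / len(truth)


if __name__ == "__main__":
    rng = random.Random(0)
    true_loads = [3, 1, 4, 1, 5, 9, 2, 6]

    # The scheduler sees only d = 2 workers per decision.
    choice = sample_based_pick(true_loads, d=2, rng=rng)
    print("placed task on worker", choice)

    # The scheduler has only ever probed workers 0 and 5; the rest is a guess.
    view = private_view({0: 3, 5: 9}, len(true_loads))
    print("MAE of private view:",
          mean_absolute_error(view, [float(x) for x in true_loads]))
```

In this toy setting the error comes entirely from workers the scheduler has never probed. That gap is the quantity a learned predictor, as described in the abstract, would aim to shrink.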