Authors: John Bent, D. Rotem, A. Romosan, A. Shoshani
Published in: CLADE 2005. Proceedings Challenges of Large Applications in Distributed Environments, 2005 (2005-07-24)
DOI: 10.1109/CLADE.2005.1520896 (https://doi.org/10.1109/CLADE.2005.1520896)
Coordination of data movement with computation scheduling on a cluster
We examine the problem of scheduling compute tasks on a cluster of servers. Each task requires files that reside on a remote archive and may also be cached on some subset of the servers. A task can run only on a server that has the files it requires, so data movement must be scheduled in coordination with computation. Our goal is to maximize throughput while minimizing data movement. FIFO scheduling is inefficient in this setting because it is unaware of the data movement each task requires. We evaluated two alternative strategies, shortest job first and linear-programming-based optimization, and compared them under various configurations.
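The abstract does not say how any of the strategies is implemented, so the following is only a minimal sketch of why ignoring cached files can cost extra data movement. Everything in it is an assumption made for illustration: the cache model (unbounded, nothing evicted), the round-robin placement used for FIFO, and the greedy `run_least_movement_first` function, which is a hypothetical stand-in for a data-aware scheduler and not the paper's shortest job first or linear programming formulation.

```python
from itertools import cycle


def missing_bytes(task_files, cache, sizes):
    """Bytes that must be fetched from the archive before the task can run."""
    return sum(sizes[f] for f in task_files if f not in cache)


def run_fifo(tasks, n_servers, sizes):
    """Arrival order with round-robin placement (assumed); ignores cached files."""
    caches = [set() for _ in range(n_servers)]
    moved = 0
    for task_files, server in zip(tasks, cycle(range(n_servers))):
        moved += missing_bytes(task_files, caches[server], sizes)
        caches[server] |= task_files  # assumption: caches never evict
    return moved


def run_least_movement_first(tasks, n_servers, sizes):
    """Hypothetical greedy rule: pick the (task, server) pair needing the fewest bytes."""
    caches = [set() for _ in range(n_servers)]
    pending = list(tasks)
    moved = 0
    while pending:
        cost, t_idx, s_idx = min(
            (missing_bytes(t, caches[s], sizes), i, s)
            for i, t in enumerate(pending)
            for s in range(n_servers)
        )
        moved += cost
        caches[s_idx] |= pending.pop(t_idx)
    return moved


# Made-up workload: four tasks over four 10-byte files, two servers.
sizes = {"a": 10, "b": 10, "c": 10, "d": 10}
tasks = [{"a", "b"}, {"c", "d"}, {"c", "d"}, {"a", "b"}]
print(run_fifo(tasks, 2, sizes), run_least_movement_first(tasks, 2, sizes))
# prints: 80 40
```

On this toy workload the greedy rule halves the bytes moved, but it does so by placing every task on one server. It therefore addresses only the data-movement half of the stated goal and says nothing about throughput, which is the trade-off the compared strategies have to balance.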