Runtime system for autonomic rescheduling of MPI programs
C. Du, Sudeshna Ghosh, S. Shankar, Xian-He Sun
International Conference on Parallel Processing, 2004 (ICPP 2004). Published 2004-08-15. DOI: https://doi.org/10.1109/ICPP.2004.1327898
Intensive research has been conducted on dynamic job scheduling, which allocates jobs to computing systems at run time. However, most existing work either is limited to redistributing independent tasks or stays at the algorithm design level. No runtime system is available to support automatic redistribution of a running process in a heterogeneous network environment. In this study, we present the design and implementation of a system that dynamically reschedules running processes over a network of computing resources via automatic decision-making and process migration. The system is implemented on top of MPI-2 and the HPCM (High Performance Computing Mobility) middleware. Experimental and analytical results show that the runtime system makes dynamic rescheduling of running tasks possible and improves system performance considerably. Although the implementation targets MPI programs and uses HPCM, the design is general and can be extended to other distributed environments.
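
The abstract does not state the decision rule the system uses, so the following is only a minimal sketch of what "automatic decision-making" before a process migration might look like. It assumes a simple cost model that is not taken from the paper: a process migrates only if its estimated remaining time on a candidate host, plus a fixed migration cost, is lower than its estimated remaining time where it currently runs. The names `Host`, `remaining_time`, and `choose_target`, and the fields `speed` and `load`, are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Host:
    """Hypothetical description of a machine in a heterogeneous network."""
    name: str
    speed: float  # assumed relative compute rate (work units per second)
    load: float   # assumed fraction of CPU used by other jobs, 0.0 <= load < 1.0


def remaining_time(work: float, host: Host) -> float:
    """Estimate the seconds needed to finish `work` units on `host`.

    Assumes the process receives the share of the CPU not taken by other jobs.
    """
    return work / (host.speed * (1.0 - host.load))


def choose_target(work: float, current: Host, candidates: Iterable[Host],
                  migration_cost: float) -> Optional[Host]:
    """Return the host to migrate to, or None if staying put is best.

    A candidate wins only if its estimated finish time, including the
    one-off migration cost in seconds, beats every option seen so far.
    """
    best: Optional[Host] = None
    best_time = remaining_time(work, current)  # baseline: do not migrate
    for host in candidates:
        estimate = remaining_time(work, host) + migration_cost
        if estimate < best_time:
            best, best_time = host, estimate
    return best
```

Under this assumed model, a large `migration_cost` (for example, a process with a large memory image to transfer) makes the function return `None`, which illustrates why a rescheduler needs to weigh migration overhead against the expected gain rather than always moving to the fastest idle host.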