Inter-Job Scheduling of High-Throughput Material Screening Applications
Zhihui Du, Xinning Hui, Yurui Wang, Jun Jiang, Jason Liu, Baokun Lu, Chongyu Wang
2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 841-852, May 2020. DOI: 10.1109/IPDPS47924.2020.00091
Citations: 0
Abstract
Material screening entails a large number of electronic structure simulations. Traditionally, these simulation runs are treated separately, each solving an independent Kohn-Sham (KS) equation. In this paper, we formulate material screening as an inter-job scheduling problem for solving a system of KS equations, which allows one to explore scheduling methods that use the results of some equations to expedite the solution of others. We propose the concept of sharing iterative simulation and employ several optimization methods to initialize a simulation run using the distribution of particles from similar jobs as the initial condition. More specifically, we propose two similarity metrics, one qualitative and the other quantitative, to predict the simulation runtime of a material screening job based on its similarity to other jobs. Accordingly, we present two inter-job scheduling algorithms that make use of the qualitative and quantitative similarity information, respectively. We conducted extensive experiments on the Sunway TaihuLight supercomputer for a practical material screening problem to evaluate the performance of the two scheduling algorithms using the proposed similarity metrics. We show that the total time required to run a large number of material screening jobs can be significantly reduced, and that the algorithms are robust even when the runtime predictions are moderately inaccurate. The quantitative algorithm outperforms the qualitative algorithm because its predictions are more accurate, which yields a larger runtime reduction.
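The idea of ordering jobs so that each one starts from the result of a similar, already finished job can be illustrated with a small sketch. This is not the paper's algorithm: the job descriptors, the Euclidean distance used as a quantitative similarity, the `scale` parameter, and the toy cost model (a warm start costs a fraction of a cold start proportional to the distance) are all assumptions made for illustration only.

```python
import math


def distance(a, b):
    """Euclidean distance between two hypothetical job descriptors."""
    return math.dist(a, b)


def predicted_cost(cold_cost, d, scale=1.0):
    """Toy runtime model (an assumption, not from the paper): a warm start
    from a job at distance d costs min(1, d / scale) of a cold start."""
    return cold_cost * min(1.0, d / scale)


def schedule(jobs, cold_cost=100.0, scale=1.0):
    """Greedy, Prim-style ordering of jobs on a single execution stream.

    The first job runs cold. Afterwards, the pending job closest to any
    finished job is run next, warm-started from that finished job.
    Returns (order, total), where order is a list of (job, donor) names.
    """
    names = list(jobs)
    first = names[0]
    order = [(first, None)]      # the first job has no donor
    total = cold_cost
    done = [first]
    pending = set(names[1:])
    while pending:
        # Smallest distance over all (pending, finished) pairs.
        d, job, donor = min(
            (distance(jobs[p], jobs[f]), p, f) for p in pending for f in done
        )
        order.append((job, donor))
        total += predicted_cost(cold_cost, d, scale)
        done.append(job)
        pending.remove(job)
    return order, total


if __name__ == "__main__":
    # Hypothetical descriptors; D is far from the other three jobs.
    jobs = {"A": (0.0, 0.0), "B": (0.1, 0.0), "C": (0.1, 0.2), "D": (2.0, 2.0)}
    order, total = schedule(jobs)
    print(order)
    print(f"predicted total: {total:.1f} vs. all-cold: {100.0 * len(jobs):.1f}")
```

In this toy setting, jobs A, B, and C benefit from one another, while D is too dissimilar to gain anything and effectively runs cold. A real scheduler would also have to account for parallel execution and for errors in the runtime prediction, which this sketch ignores.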