Exploiting Data Dependency to Mitigate Stragglers in Distributed Spatial Simulation
Eman Bin Khunayn, S. Karunasekera, Hairuo Xie, K. Ramamohanarao
Proceedings of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2017-11-07
DOI: 10.1145/3139958.3140018
Citations: 9
Abstract
Distributed spatial simulations commonly employ an implementation of the Bulk Synchronous Parallel (BSP) model. However, BSP implementations usually suffer from the straggler problem, where a delay in any one worker slows down the entire system. Random stragglers occur for many reasons, such as imbalanced workload, operating system scheduling, or communication delays. The straggler problem is further exacerbated by increasing parallelism. To reduce the straggler problem while preserving the simplicity and scalability advantages of the BSP model, we propose a new parallel model, which we call the Priority Asynchronous Parallel (PAP) model. PAP exploits the data dependencies between parallel processes, so that data is computed and synchronized according to its priority to the other workers. For further computational improvement, we develop a load balancing and partitioning method, called GridGraph, that utilizes the spatial and connectivity properties of the simulation space to reduce the size of the exchanged data in addition to balancing the workload among workers. The proposed schemes are implemented and evaluated in a microscopic traffic simulator. Running traffic simulations for the cities of Melbourne, Beijing, and New York on 80 workers, the simulator achieves a performance speedup of around 47.4% for Melbourne, 52.18% for Beijing, and 65.84% for New York when using the PAP model combined with GridGraph partitioning, compared with the BSP model.
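The straggler effect under BSP can be illustrated with a minimal sketch. It is not the paper's PAP model or simulator. It only shows why a barrier-synchronized step runs at the pace of the slowest worker, and why the chance of hitting a straggler grows with the number of workers. The function names, the sample timings, the 1% per-step delay probability, and the assumption that workers are delayed independently are all hypothetical choices made for illustration.

```python
def bsp_step_time(worker_times):
    """Duration of one BSP superstep: the barrier releases only when the
    slowest worker has finished, so the step takes the maximum time."""
    return max(worker_times)


def straggler_probability(num_workers, p_delay):
    """Probability that at least one worker is delayed in a step,
    ASSUMING each worker is delayed independently with probability p_delay."""
    return 1.0 - (1.0 - p_delay) ** num_workers


# Hypothetical per-worker compute times (seconds) for a single superstep:
# three workers finish in 1.0 s, one straggler needs 2.5 s.
times = [1.0, 1.0, 1.0, 2.5]
print("BSP step time:", bsp_step_time(times))  # 2.5, set by the straggler

# Illustrative 1% per-worker delay probability (an assumption, not measured).
for n in (10, 80):
    print(n, "workers ->", round(straggler_probability(n, 0.01), 3))
```

Under these assumed numbers, the probability that some worker straggles in a given step rises from roughly one in ten with 10 workers to more than one in two with 80 workers, which matches the abstract's observation that the problem worsens as parallelism increases.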