Dynamic Straggler Mitigation for Large-Scale Spatial Simulations

Impact factor 1.2, JCR Q4 (Remote Sensing)
Eman Bin Khunayn, Hairuo Xie, S. Karunasekera, K. Ramamohanarao
Journal: ACM Transactions on Spatial Algorithms and Systems. Published: 2023-01-06. DOI: https://doi.org/10.1145/3578933
Citations: 1

Abstract

Spatial simulations have been widely used to study real-world environments, such as transportation systems. Applications like prediction and analysis of transportation require the simulation to handle millions of objects while running faster than real time. Running such large-scale simulation requires high computational power, which can be provided through parallel distributed computing. Implementations of parallel distributed spatial simulations usually follow a bulk synchronous parallel (BSP) model to ensure the correctness of simulation. The processing in BSP is divided into iterations of computation and communication, running on multiple workers, followed by a global barrier synchronisation to ensure that all communications are concluded. Unfortunately, the BSP model is plagued by the straggler problem, where a delay in any worker slows down the entire simulation. Stragglers may occur for many reasons, including imbalanced workload distribution or communication and synchronisation delays. The straggler problem can become more severe with increasing parallelism and continuous change of workload distribution among workers. This article proposes methods to dynamically mitigate stragglers and tackle communication delays. The proposed strategies can rebalance the workload distribution during simulation. These methods employ the spatial properties of the simulated environments to combine a flexible synchronisation model with decentralised dynamic load balancing and on-demand resource allocation. All proposed methods are implemented and evaluated using a microscopic traffic simulator as an example of large-scale spatial simulations. We run traffic simulations for Melbourne, Beijing and New York with different straggler scenarios. Our methods significantly improve simulation performance compared to advanced methods such as global dynamic load balancing.
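The straggler effect described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the line-shaped partition layout, and the neighbour-only balancing rule are all assumptions chosen for illustration. The sketch models a BSP superstep as costing the time of the slowest worker, then lets each worker hand load only to workers that own adjacent spatial partitions, which is one simple way to picture decentralised balancing.

```python
from typing import Dict, List


def superstep_time(loads: List[float], speeds: List[float]) -> float:
    """Duration of one BSP superstep: the barrier opens only when the
    slowest worker has finished, so the step costs the maximum time."""
    return max(load / speed for load, speed in zip(loads, speeds))


def rebalance_round(
    loads: List[float],
    speeds: List[float],
    neighbours: Dict[int, List[int]],
) -> List[float]:
    """One round of neighbour-only balancing (hypothetical scheme).

    Each worker looks only at the workers owning adjacent spatial
    partitions. If it finishes later than its earliest-finishing
    neighbour, it hands over just enough load to equalise their times.
    """
    loads = list(loads)  # work on a copy, leave the caller's list intact
    for i in range(len(loads)):
        # Neighbour that currently finishes earliest.
        j = min(neighbours[i], key=lambda n: loads[n] / speeds[n])
        if loads[i] / speeds[i] > loads[j] / speeds[j]:
            # Solve (l_i - x) / s_i == (l_j + x) / s_j for x.
            x = (loads[i] * speeds[j] - loads[j] * speeds[i]) / (
                speeds[i] + speeds[j]
            )
            loads[i] -= x
            loads[j] += x
    return loads


# Assumed scenario: four workers on a line of partitions,
# with worker 2 running at half speed (the straggler).
loads = [100.0, 100.0, 100.0, 100.0]
speeds = [1.0, 1.0, 0.5, 1.0]
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

print("before:", superstep_time(loads, speeds))
for round_no in range(3):
    loads = rebalance_round(loads, speeds, neighbours)
    print("round", round_no + 1, "->", round(superstep_time(loads, speeds), 2))
```

Because every transfer only equalises the finishing times of two adjacent workers, the superstep time never increases from one round to the next, and no worker needs a global view of the load. A global balancer would instead collect all loads in one place before redistributing them.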
Source journal: CiteScore 4.40; self-citation rate 5.30%; articles published: 43
About the journal: ACM Transactions on Spatial Algorithms and Systems (TSAS) is a scholarly journal that publishes the highest quality papers on all aspects of spatial algorithms and systems and closely related disciplines. It has a multi-disciplinary perspective in that it spans a large number of areas where spatial data is manipulated or visualized (regardless of how it is specified, i.e., geometrically or textually) such as geography, geographic information systems (GIS), geospatial and spatiotemporal databases, spatial and metric indexing, location-based services, web-based spatial applications, geographic information retrieval (GIR), spatial reasoning and mining, security and privacy, as well as the related visual computing areas of computer graphics, computer vision, geometric modeling, and visualization where the spatial, geospatial, and spatiotemporal data is central.