A Comparison of Two Approaches to Parallel Simulation of Multiprocessors

A. Over, Bill Clarke, P. Strazdins
{"title":"A Comparison of Two Approaches to Parallel Simulation of Multiprocessors","authors":"A. Over, Bill Clarke, P. Strazdins","doi":"10.1109/ISPASS.2007.363732","DOIUrl":null,"url":null,"abstract":"The design trend towards CMPs has made the simulation of multiprocessor systems a necessity and has also made multiprocessor systems widely available. While a serial multiprocessor simulation necessarily imposes a linear slowdown, running such a simulation in parallel may help mitigate this effect. In this paper we document our experiences with two different methods of parallelizing Sparc Sulima, a simulator of UltraSPARC IIICu-based multiprocessor systems. In the first approach, a simple interconnect model within the simulator is parallelized non-deterministically using careful locking. In the second, a detailed interconnect model is parallelized while preserving determinism using parallel discrete event simulation (PDES) techniques. While both approaches demonstrate a threefold speedup using 4 threads on workloads from the NAS parallel benchmarks, speedup proved constrained by load-balancing between simulated processors. A theoretical model is developed to help understand why observed speedup is less than ideal. An analysis of the related speed-accuracy tradeoff in the first approach with respect to the simulation time quantum is also given; the results show that, for both serial and parallel simulation, a quantum in the order of a few hundreds of cycles represents a `sweet-spot', but parallel simulation is significantly more accurate for a given quantum size. As with the speedup analysis, these effects are workload dependent","PeriodicalId":439151,"journal":{"name":"2007 IEEE International Symposium on Performance Analysis of Systems & Software","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 IEEE International Symposium on Performance Analysis of Systems & Software","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPASS.2007.363732","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

Abstract

The design trend towards CMPs has made the simulation of multiprocessor systems a necessity and has also made multiprocessor systems widely available. While a serial multiprocessor simulation necessarily imposes a linear slowdown, running such a simulation in parallel may help mitigate this effect. In this paper we document our experiences with two different methods of parallelizing Sparc Sulima, a simulator of UltraSPARC IIICu-based multiprocessor systems. In the first approach, a simple interconnect model within the simulator is parallelized non-deterministically using careful locking. In the second, a detailed interconnect model is parallelized while preserving determinism using parallel discrete event simulation (PDES) techniques. While both approaches demonstrate a threefold speedup using 4 threads on workloads from the NAS parallel benchmarks, speedup proved constrained by load-balancing between simulated processors. A theoretical model is developed to help understand why observed speedup is less than ideal. An analysis of the related speed-accuracy tradeoff in the first approach with respect to the simulation time quantum is also given; the results show that, for both serial and parallel simulation, a quantum in the order of a few hundreds of cycles represents a `sweet-spot', but parallel simulation is significantly more accurate for a given quantum size. As with the speedup analysis, these effects are workload dependent
多处理器并行仿真的两种方法比较
cmp的设计趋势使多处理器系统的仿真成为必要,也使多处理器系统得到广泛应用。虽然串行多处理器模拟必然会造成线性减速,但并行运行这样的模拟可能有助于减轻这种影响。在本文中,我们记录了我们使用两种不同方法并行化Sparc Sulima的经验,Sparc Sulima是一种基于UltraSPARC iiicu的多处理器系统模拟器。在第一种方法中,模拟器中的简单互连模型使用谨慎的锁定进行非确定性并行化。其次,在保持确定性的同时,使用并行离散事件模拟(PDES)技术对详细的互连模型进行并行化。虽然这两种方法在NAS并行基准测试的工作负载上使用4个线程都可以实现三倍的加速,但事实证明,加速受到模拟处理器之间负载平衡的限制。开发了一个理论模型来帮助理解为什么观察到的加速不是理想的。本文还分析了第一种方法在仿真时间量子方面的速度-精度权衡;结果表明,对于串行和并行模拟,以几百个周期为顺序的量子代表一个“最佳点”,但并行模拟对于给定量子大小明显更准确。与加速分析一样,这些影响取决于工作负载
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信