{"title":"A Performance Evaluation Methodology for Parallel Simulation Protocols","authors":"V. Jha, R. Bagrodia","doi":"10.1109/PADS.1996.761576","DOIUrl":"https://doi.org/10.1109/PADS.1996.761576","url":null,"abstract":"Most experimental studies of the performance of parallel simulation protocols use speedup or number of events processed per unit time as the performance metric. Although helpful in evaluating the usefulness of parallel simulation for a given simulation model, these metrics tell us little about the efficiency of the simulation protocol used. In this paper, we describe an Ideal Simulation Protocol (ISP), based on the concept of critical path, which experimentally computes the best possible execution time for a simulation model on a given parallel architecture. Since ISP computes the bound by actually executing the model on the given parallel architecture, it is much more realistic than that computed by a uniprocessor critical path analysis. The paper illustrates, using parameterized synthetic benchmarks, how an ISP-based performance evaluation can lead to much better insights into the performance of parallel simulation protocols than what would be gained from speedup graphs alone.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117017140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating the Cost of Throttled Execution in Time Warp","authors":"Samir R Das","doi":"10.1109/PADS.1996.761577","DOIUrl":"https://doi.org/10.1109/PADS.1996.761577","url":null,"abstract":"Over-optimistic execution has long been identified as a major performance bottleneck in Time Warp based parallel simulation systems. An appropriate throttle or control of optimism can improve performance by reducing the number of rollbacks. However, the design of an appropriate throttle is a difficult task, as correct computations on the critical path may be blocked, thus increasing the overall execution time. In this paper we build a cost model for throttled execution that involves both rollback probability and probability for an event computation being on the critical path. The model can estimate an appropriate size of time window for a throttled execution using statistics collected from the purely optimistic execution. The model is validated by an experimental study with a set of synthetic workloads.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115471100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Dynamic Load Balancing of Clustered Time Warp for Logic Simulation","authors":"Hervé Avril, C. Tropper","doi":"10.1145/238788.238804","DOIUrl":"https://doi.org/10.1145/238788.238804","url":null,"abstract":"We present, in this paper, a dynamic load balancing algorithm developed for Clustered Time Warp, a hybrid approach which makes use of Time Warp between clusters of LPs and a sequential mechanism within the clusters. The load balancing algorithm focuses on distributing the load of the simulation evenly among the processors and then tries to reduce interprocessor communications. We make use of a triggering technique based on the throughput of the simulation system. The algorithm was implemented and its performance was measured using two of the largest benchmark digital circuits of the ISCAS’89 series. In order to measure the effects of the algorithm on workload distribution, inter-processor communication and rollbacks, we defined three distinct metrics. Results show that by dynamically balancing the load, the throughput was improved by 40 to 100% when compared to Time Warp. Throughput is the number of non rolled-back message events per unit time. When the algorithm tried to reduce inter-processor communication, rollbacks were substantially reduced. 
Nevertheless, no substantial improvement was observed on the overall simulation time, suggesting that load distribution is the most important factor to be taken into consideration in speeding up the simulation of digital circuits.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128030281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel Simulation of Billiard Balls using Shared Variables","authors":"P.A. MacKenzie, C. Tropper","doi":"10.1145/238788.238845","DOIUrl":"https://doi.org/10.1145/238788.238845","url":null,"abstract":"We present, in this paper, a conservative algorithm for the parallel simulation of billiard balls. It is common to employ a spatial approach to these simulations, in which the billiard table is partitioned into segments which are simulated by different processors. It differs from previous approaches in that it makes use of shared variables to enable processors to ascertain the state of the computation at neighboring processors. The shared variable corresponds to a region at the boundary of the table segments (the so-called critical region). By making use of shared variables a significant speed-up over the execution time of a purely conservative approach is obtained. The algorithm was implemented on a BBN Butterfly, as was a purely conservative algorithm. In the conservative algorithm, a processor wishing to process a ball in the critical region waits until the neighboring processor's simulation time is greater than the time of the event it wishes to process. In our experiments, we examined three population levels of balls ([illegible]). 
These populations were chosen to reflect a low, medium and high population of balls. The shared variable approach resulted in a 30 to 50 percent decrease in the execution time of the purely conservative approach.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130221173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discrete-Event Simulation and the Event Horizon Part 2: Event List Management","authors":"J. Steinman","doi":"10.1145/238788.238841","DOIUrl":"https://doi.org/10.1145/238788.238841","url":null,"abstract":"The event horizon is a very important concept that applies to both parallel and sequential discrete-event simulations. By exploiting the event horizon, parallel simulations can process events optimistically in a risk-free manner (i.e., without requiring antimessages) using adaptable \"breathing\" time cycles with variable time widths. Additionally, exploiting the event horizon can significantly reduce the overhead of event list management that is common to virtually all discrete-event simulations. This paper is a continuation of work previously reported at PADS94. In that report, a complete mathematical formulation of the event horizon was derived under equilibrium conditions using the hold model. Various forms of the beta density function were consequently used to verify the predicted results of the analytic model. This second report describes how the concept of the event horizon can also be applied to event list management. By exploiting the event horizon, the performance of several priority queue data structures is improved, including linked lists, various binary trees, and heaps. A somewhat detailed description of these modified data structures along with other relevant background information is provided for completeness. 
Performance results for each of these priority queue data structures are provided.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117020177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reducing Synchronization Overhead in Parallel Simulation","authors":"Ulana Legedza, W. Weihl","doi":"10.1145/238788.238822","DOIUrl":"https://doi.org/10.1145/238788.238822","url":null,"abstract":"Synchronization is often the dominant cost in conservative parallel simulation, particularly in simulations of parallel computers, in which low-latency simulated communication requires frequent synchronization. We present and evaluate LOCAL BARRIERS and PREDICTIVE BARRIER SCHEDULING, two techniques for reducing synchronization overhead in the simulation of message-passing multicomputers. Local barriers use nearest-neighbor synchronization to reduce waiting time at synchronization points. Predictive barrier scheduling, a novel technique that schedules synchronizations using both compile-time and runtime analysis, reduces the frequency of synchronization operations. In contrast to other work in this area, both techniques reduce synchronization overhead without decreasing the accuracy of network simulation. These techniques were evaluated by comparing their performance to that of periodic global synchronization. Experiments show that local barriers improve performance by up to 24% for communication-bound applications, while predictive barrier scheduling improves performance by up to 65% for applications with long local computation phases. Because the two techniques are complementary, we advocate a combined approach. 
This work was done in the context of PARALLEL PROTEUS, a new parallel simulator of message-passing multicomputers.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126341361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel Simulation of a High-Speed Wormhole Routing Network","authors":"R. Bagrodia, Yuan Chen, M. Gerla, B. Kwan, Jay Martin, P. Palnati, S. Walton","doi":"10.1145/238788.238813","DOIUrl":"https://doi.org/10.1145/238788.238813","url":null,"abstract":"A flexible simulator has been developed to simulate a two-level metropolitan area network which uses wormhole routing. To accurately model the nature of wormhole routing, the simulator performs discrete-byte rather than discrete-packet simulation. Despite the increased computational workload that this implies, it has been possible to create a simulator with acceptable performance by writing it in Maisie, a parallel discrete-event simulation language. The simulator provides an accurate model of an actual high-speed, source-routing, wormhole network (the Myrinet) and is the first such simulator. The paper describes the simulator and reports on the performance of parallel implementations of the simulator on a 24-node IBM SP 2 multicomputer. The parallel implementations yielded reasonable speedups. For instance, on 12 nodes, the conservative algorithm yielded a speed-up of about 6 whereas an optimistic algorithm yielded a speed-up of about 4.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131750935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Strategy of Model Partitioning for VLSI-Design Using an Improved Mixture of Experts Approach","authors":"K. Hering, R. Haupt, T. Villmann","doi":"10.1145/238788.238825","DOIUrl":"https://doi.org/10.1145/238788.238825","url":null,"abstract":"The partitioning of complex processor models on the gate and register-transfer level for parallel functional simulation based on the clock-cycle algorithm is considered. We introduce a hierarchical partitioning scheme combining various partitioning algorithms in the frame of a competing strategy. Melting together different partitioning results within one level using superpositions we crossover to a mixture of experts one. This approach is improved applying genetic algorithms. In addition we present two new partitioning algorithms both of them taking cones as fundamental units for building partitions.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115933206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transparent Incremental State Saving in Time Warp Parallel Discrete Event Simulation","authors":"R. Rönngren, M. Liljenstam, R. Ayani, J. Montagnat","doi":"10.1145/238788.238818","DOIUrl":"https://doi.org/10.1145/238788.238818","url":null,"abstract":"Many systems rely on the ability to rollback (or restore) parts of the system state to undo or recover from undesired or erroneous computations. Examples of such systems include fault tolerant systems with checkpointing, editors with undo capabilities, transaction and data base systems and optimistically synchronized parallel and distributed simulations. An essential part of such systems is the state saving mechanism. It should not only allow efficient state saving, but also support efficient state restoration in case of roll back. Furthermore, it is often a requirement that this mechanism is transparent to the user. In this paper we present a method to implement a transparent incremental state saving mechanism in an optimistically synchronized parallel discrete event simulation system based on the Time Warp mechanism. The usefulness of this approach is demonstrated by simulations of large, detailed, realistic FCA and a DCA-like cellular phone systems.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133145786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of High Level Modelling / High Performance Simulation Environments","authors":"B. Zeigler, Doohwan Kim","doi":"10.1145/238788.238839","DOIUrl":"https://doi.org/10.1145/238788.238839","url":null,"abstract":"Advances in massively parallel platforms are increasing the prospects for high performance discrete event simulation. Still the difficulty in parallel programming persists and there is increasing demand for high level support for building discrete event models to execute on such platforms. We present a parallel DEVS-based (Discrete Event System Specification) simulation environment that can execute on distributed memory multicomputer systems with benchmarking results of a class of high resolution, large scale ecosystem models. Underlying the environment is a parallel container class library for hiding the details of message passing technology while providing high level abstractions for hierarchical, modular DEVS models. The C++ implementation working on the Thinking Machines CM-5 demonstrates that the desire for high level modeling support need not be irreconcilable with sustained high performance.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127074895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}