{"title":"Time Management in the DoD High Level Architecture","authors":"R. Fujimoto, R. Weatherly","doi":"10.1145/238788.238817","DOIUrl":"https://doi.org/10.1145/238788.238817","url":null,"abstract":"Recently, a considerable amount of effort in the U.S. Department of Defense has been devoted to defining the High Level Architecture (HLA) for distributed simulations. This paper describes the time management component of the HLA that defines the means by which individual simulations (called federates) advance through time. Time management includes synchronization mechanisms to ensure event ordering when this is needed. The principal challenge of the time management structure is to support interoperability among federates using different local time management mechanisms such as that used in DIS, conservative and optimistic mechanisms developed in the parallel simulation community, and real-time hardware-in-the-loop simulations.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115531141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Incremental State Saving","authors":"D. West, K. Panesar","doi":"10.1145/238788.238820","DOIUrl":"https://doi.org/10.1145/238788.238820","url":null,"abstract":"We present an Incremental State Saving technique for which the state saving calls are inserted automatically by directly editing the application executable. This method has the advantage of being easy to use since it is fully automatic, and has good performance since it adds overhead only where state is being modified. Since the editing happens on executable code, the method is independent of the compiler, and allows third party libraries to be used. None of the previous incremental state saving methods have both of these features. We find that it is beneficial to use Automatic Incremental State Saving if less than 15% of the state is modified in each event as compared to copy state saving. This technique allows us to efficiently parallelize existing simulations, and makes Time Warp more accessible to non-Time Warp experts.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116014808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimistic Simulation of Parallel Architectures Using Program Executables","authors":"S. Chandrasekaran, M. Hill","doi":"10.1145/238788.238838","DOIUrl":"https://doi.org/10.1145/238788.238838","url":null,"abstract":"A key tool of computer architects is computer simulation at the level of detail that can execute program executables. The time and memory requirements of such simulations can be enormous, especially when the machine under design (the target) is a parallel machine. Thus, it is attractive to use parallel simulation, as successfully demonstrated by the Wisconsin Wind Tunnel (WWT). WWT uses a conservative simulation algorithm and eschews network simulation to make lookahead adequate. Nevertheless, we find most of WWT's slowdown to be due to the synchronization overhead in the conservative simulation algorithm. This paper examines the use of optimistic algorithms to perform parallel simulations of parallel machines. We first show that we can make optimistic algorithms work correctly even with WWT's direct execution of program executables. We checkpoint processor registers (integer, floating-point, and condition codes) and use executable editing to log the value of memory words just before they are overwritten by stores. Second, we consider the performance of two optimistic algorithms. The first executes programs optimistically, but performs protocol events (e.g., sending messages) conservatively. The second executes everything optimistically and is similar to Time Warp with lazy message cancellation. Unfortunately, both approaches make parallel simulation performance worse for the default WWT assumptions. We conclude by speculating on the performance of optimistic simulation when simulating (1) target network details, and (2) on hosts with high message latencies and no synchronization hardware.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130904594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concurrency Preserving Partitioning (CPP) for Parallel Logic Simulation","authors":"Hong-Kyu Kim, Jack S. N. Jean","doi":"10.1145/238788.238823","DOIUrl":"https://doi.org/10.1145/238788.238823","url":null,"abstract":"Based on a linear ordering of vertices in a directed graph, a linear-time partitioning algorithm for parallel logic simulation is presented. Unlike most other partitioning algorithms, the proposed algorithm preserves circuit concurrency by assigning to processors circuit gates that can be evaluated at about the same time. As a result, the concurrency preserving partitioning (CPP) algorithm can provide better load balancing throughout the period of a parallel simulation. This is especially important when the algorithm is used together with a Time Warp simulation where a high degree of concurrency can lead to fewer rollbacks and better performance. The algorithm consists of three phases, and three conflicting goals can be separately considered in each phase so to reduce computational complexity. A parallel gate-level circuit simulator is implemented on an Intel Paragon machine to evaluate the performance of the CPP algorithm. The results are compared with two other partitioning algorithms to show that reasonable speedup may be achieved with the algorithm.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":"1986 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120849151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Experiments in Automated Load Balancing","authors":"L. F. Wilson, D. Nicol","doi":"10.1145/238788.238796","DOIUrl":"https://doi.org/10.1145/238788.238796","url":null,"abstract":"One of the promises of parallelized discrete-event simulation is that it might provide significant speedups over sequential simulation. In reality, high performance cannot be achieved unless the system is fine-tuned to balance computation, communication, and synchronization requirements. In this paper, we discuss our experiments in automated load balancing using the SPEEDES simulation framework. Specifically, we examine three mapping algorithms that use run-time measurements. Using simulation models of queuing networks and the National Airspace System, we investigate (i) the use of run-time data to guide mapping, (ii) the utility of considering communication costs in a mapping algorithm, (iii) the degree to which computational \"hot-spots\" ought to be broken up in the linearization, and (iv) the relative execution costs of the different algorithms. We compare the performance of the three algorithms using results from the Intel Paragon.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125307185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Actor Based Parallel VHDL Simulation Using Time Warp","authors":"V. Krishnaswamy, P. Banerjee","doi":"10.1145/238788.238836","DOIUrl":"https://doi.org/10.1145/238788.238836","url":null,"abstract":"One of the methods used to reduce the time spent simulating VHDL designs is by parallelizing the simulation. In this paper, we describe the implementation of an object-oriented Time Warp simulator for VHDL on an actor based environment. The actor model of computation allows the exploitation of fine grained parallelism in a truly asynchronous manner and allows for the overlap of computation with communication. Some preliminary results obtained by simulating a set of multipliers and some ISCAS benchmark circuits are provided. In addition, the importance of placing processes based on circuit partitioning techniques for improving runtimes and scalability is demonstrated. Results are reported on a Sun SPARCServer 1000 and an Intel Paragon.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128150781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Queueing Models and Stability of Message Flows in Distributed Simulators of Open Queueing Networks","authors":"Manish Gupta, Anurag Kumar, R. Shorey","doi":"10.1145/238788.238840","DOIUrl":"https://doi.org/10.1145/238788.238840","url":null,"abstract":"In this paper we study message flow processes in distributed simulators of open queueing networks. We develop and study queueing models for distributed simulators with maximum lookahead sequencing. We characterize the \"external\" arrival process, and the message feedback process in the simulator of a simple queueing network with feedback. We show that a certain \"natural\" modelling construct for the arrival process is exactly correct, whereas an \"obvious\" model for the feedback process is wrong; we then show how to develop the correct model. Our analysis throws light on the stability of distributed simulators of queueing networks with feedback. We show how the stability of such simulators depends on the parameters of the queueing network.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116439061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conservative Parallel Simulation of ATM Networks","authors":"J. Cleary, Jya-Jang Tsai","doi":"10.1145/238788.238807","DOIUrl":"https://doi.org/10.1145/238788.238807","url":null,"abstract":"A new Conservative algorithm for both parallel and sequential simulation of networks is described. The technique is motivated by the construction of a high performance simulator for ATM networks. It permits very fast execution of models of ATM systems, both sequentially and in parallel. A simple analysis of the performance of the system is made. Initial performance results from parallel and sequential implementations are presented and compared with comparable results from an optimistic TimeWarp based simulator. It is shown that the conservative simulator performs well when the \"density\" of messages in the simulated system is high, a condition which is likely to hold in many interesting ATM scenarios.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126014285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Background Execution of Time Warp Programs","authors":"C. Carothers, R. Fujimoto","doi":"10.1109/PADS.1996.761558","DOIUrl":"https://doi.org/10.1109/PADS.1996.761558","url":null,"abstract":"A load distribution system is proposed to enable a single Time Warp program to execute in background, spreading over a collection of possibly heterogeneous workstations (including multiprocessor hosts), utilizing whatever otherwise unused CPU cycles are available. The system uses a simple processor allocation policy to dynamically add or delete hosts from the set of processors utilized by the Time Warp program during its execution. A load balancing algorithm is used that allocates logical processes (LPs) to processors, taking into account other computations executing on the host from the system or other user applications. A clustering mechanism is used to group collections of logical processes together, reducing process migration overheads and helping to retain locality of communication for simulations containing large numbers of LPs. An initial, prototype implementation of the load distribution system is described that executes on a homogeneous network of Silicon Graphics workstations. Initial experiments indicate this approach shows promise in enabling efficient execution of Time Warp programs \"in background\" on distributed computing platforms.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126571940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Conservative VHDL Simulation Performance by Reduction of Feedback","authors":"J. F. Hurford, T. Hartrum","doi":"10.1145/238788.238847","DOIUrl":"https://doi.org/10.1145/238788.238847","url":null,"abstract":"This paper describes two forms of feedback in the simulation runtime of VHDL circuits that greatly influence performance. While circuit feedback and strongly connected components have been observed and documented as detrimental influences to conservative parallel discrete event simulation (PDES) efficiency, that influence has never been quantified. Moreover, in this study, the phenomenon of induced feedback was observed to diminish speedup to the same degree as explicit feedback. In this paper the influence of feedback on simulation runtime is analyzed and an algorithm for its elimination is presented. In addition, a metric for the quantification of feedback is introduced. By measuring feedback, it is possible to balance its influence on simulation runtime with that of other factors (e.g. load balance, number of processors, machine granularity, etc.) through the use of a cost-based partitioning approach. This paper reports significant improvements in runtime for three circuits due to the prevention of feedback using the partitioning algorithm presented. In addition, strong correlation between the feedback metric and conservative parallel simulation overhead is demonstrated.","PeriodicalId":326232,"journal":{"name":"Proceedings of Symposium on Parallel and Distributed Tools","volume":"185 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115884301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}