{"title":"P-GAS: Parallelizing a Cycle-Accurate Event-Driven Many-Core Processor Simulator Using Parallel Discrete Event Simulation","authors":"Huiwei Lv, Yuan Cheng, Lu Bai, Mingyu Chen, Dongrui Fan, Ninghui Sun","doi":"10.1109/PADS.2010.5471655","DOIUrl":"https://doi.org/10.1109/PADS.2010.5471655","url":null,"abstract":"Multi-core processors are commonly available now, but most traditional computer architectural simulators still use single-thread execution. In this paper we use parallel discrete event simulation (PDES) to speed up a cycle-accurate event-driven many-core processor simulator. Evaluation against the sequential version shows that the parallelized one achieves an average speedup of 10.9× (up to 13.6×) running SPLASH-2 kernels on a 16-core host machine, with cycle counter differences of less than 0.1%. Moreover, super-linear speedups are achieved between running 1 thread and 8 threads due to reduced overhead of insert-event-to-queue time and increased cache size in parallel processing. We conclude that PDES could be an attractive option for achieving fast cycle-accurate many-core processor simulations.","PeriodicalId":388814,"journal":{"name":"2010 IEEE Workshop on Principles of Advanced and Distributed Simulation","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131299274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Methodology to Predict the Performance of Distributed Simulations","authors":"D. Gianni, G. Iazeolla, A. D’Ambrogio","doi":"10.1109/PADS.2010.5471669","DOIUrl":"https://doi.org/10.1109/PADS.2010.5471669","url":null,"abstract":"Predicting the time-performance of a Distributed Simulation (DS) system may be of interest to evaluate system alternatives during the development cycle, before the system is implemented. In this paper, we introduce a methodology to predict the execution time of a DS system during its design phase. The methodology is based on a model-building approach that, starting from the design documents of the DS system, first produces its performance model and then evaluates it. The model includes components such as the middleware to use (e.g., the HLA RTI), the set of DS execution hosts and the set of host interconnection networks. The methodology is applied to determine whether or not producing the distributed simulator of a given system may be advantageous in terms of execution time with respect to a conventional local simulator. An example use of the methodology is presented and validated by a comparison of the time-prediction with the actual execution time of the implemented DS system.","PeriodicalId":388814,"journal":{"name":"2010 IEEE Workshop on Principles of Advanced and Distributed Simulation","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131532372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multi-State Q-Learning Approach for the Dynamic Load Balancing of Time Warp","authors":"S. Meraji, Wei Zhang, C. Tropper","doi":"10.1109/PADS.2010.5471661","DOIUrl":"https://doi.org/10.1109/PADS.2010.5471661","url":null,"abstract":"In this paper, we present a dynamic load-balancing algorithm for optimistic gate level simulation making use of a machine learning approach. We first introduce two dynamic load-balancing algorithms oriented towards balancing the computational and communication load respectively in a Time Warp simulator. In addition, we utilize a multi-state Q-learning approach to create an algorithm which is a combination of the first two algorithms. The Q-learning algorithm determines the value of three important parameters: the number of processors which participate in the algorithm, the load which is exchanged during its execution and the type of load-balancing algorithm. We investigate the algorithm on gate level simulations of several open source VLSI circuits.","PeriodicalId":388814,"journal":{"name":"2010 IEEE Workshop on Principles of Advanced and Distributed Simulation","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115024914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ad Hoc Distributed Simulation of Queueing Networks","authors":"Ya-Lin Huang, C. Alexopoulos, M. Hunter, R. Fujimoto","doi":"10.1109/PADS.2010.5471650","DOIUrl":"https://doi.org/10.1109/PADS.2010.5471650","url":null,"abstract":"Ad hoc distributed simulation is an approach to predict future states of operational systems. It is based on embedding on-line simulations into a sensor network and adding communication and synchronization among the simulators. While prior work focused on this approach in the context of online management of transportation systems, this paper describes a generalization of the method and shows how it can be applied to embedded simulation of systems that can be modeled as a network of queues. An implementation of an ad hoc queueing network simulation is described. The flows of units across links connecting nodes in different simulations are approximated by renewal processes whose parameters are updated dynamically. The synchronization mechanism uses random sampling to update flow rates across simulations. Preliminary results show that the ad hoc queueing network simulation can provide predictions comparable to sequential simulations.","PeriodicalId":388814,"journal":{"name":"2010 IEEE Workshop on Principles of Advanced and Distributed Simulation","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126883313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QoS-Aware Server Provisioning for Large-Scale Distributed Virtual Environments","authors":"T. Duong, Suiping Zhou, Wentong Cai, Xueyan Tang, R. Ayani","doi":"10.1109/PADS.2010.5471667","DOIUrl":"https://doi.org/10.1109/PADS.2010.5471667","url":null,"abstract":"Maintaining interactivity is one of the key challenges in distributed virtual environments (DVE) due to the large, heterogeneous Internet latency and the fact that clients in a DVE are usually geographically separated. Previous work in this area has dealt with optimizing interactivity performance given limited server resources. In this paper, we consider a new problem, termed the performance-constrained server provisioning, whose goal is to minimize the resources needed to achieve a pre-determined level of Quality of Service (QoS). We identify and formulate two variants of this new problem and show that they are both NP-hard via reductions to the set covering problem. We also propose several computationally efficient approximation algorithms for solving the problem. Via extensive simulation study, we show that the newly proposed algorithms that take into account inter-server dependencies significantly outperform the well-known set covering algorithm for both problem variants.","PeriodicalId":388814,"journal":{"name":"2010 IEEE Workshop on Principles of Advanced and Distributed Simulation","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130702414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Multi-Grained Parallelism in Compute-Intensive DEVS Simulations","authors":"Qi Liu, Gabriel A. Wainer","doi":"10.1109/PADS.2010.5471652","DOIUrl":"https://doi.org/10.1109/PADS.2010.5471652","url":null,"abstract":"We propose a computing technique for efficient parallel simulation of compute-intensive DEVS models on the IBM Cell processor, combining multi-grained parallelism and various optimizations to speed up the event execution. Unlike most existing parallelization strategies, our approach explicitly exploits the massive fine-grained event-level parallelism inherent in the simulation process, while most of the logical processes are virtualized, making the achievable parallelism more deterministic and predictable. Together, the parallelization and optimization strategies produced promising experimental results, accelerating the simulation of a 3D environmental model by a factor of up to 33.06. The proposed methods can also be applied to other multicore and shared-memory architectures.","PeriodicalId":388814,"journal":{"name":"2010 IEEE Workshop on Principles of Advanced and Distributed Simulation","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134643488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selecting Simulation Algorithm Portfolios by Genetic Algorithms","authors":"Roland Ewald, Rene Schulz, A. Uhrmacher","doi":"10.1109/PADS.2010.5471673","DOIUrl":"https://doi.org/10.1109/PADS.2010.5471673","url":null,"abstract":"An algorithm portfolio is a set of algorithms that are bundled together for increased overall performance. While being mostly applied to computationally hard problems so far, we investigate portfolio selection for simulation algorithms and focus on their application to adaptive simulation replication. Since the portfolio selection problem is itself hard to solve, we introduce a genetic algorithm to select the most promising portfolios from large sets of simulation algorithms. The effectiveness of this mechanism is evaluated by data from both a realistic performance study and a dedicated test environment.","PeriodicalId":388814,"journal":{"name":"2010 IEEE Workshop on Principles of Advanced and Distributed Simulation","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129754079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}