{"title":"Enabling Distributed Simulations Using Big Data and Clouds","authors":"P. Kacsuk","doi":"10.1145/2769458.2769485","DOIUrl":"https://doi.org/10.1145/2769458.2769485","url":null,"abstract":"Large scientific simulation projects should enable the collaboration of large scientific consortia where members are located in different countries and even continents storing their usually very large data set in different kind of storages. Therefore state-of-the-art simulations should process very large set of data stored in a distributed way in different kind of storages located in all over the world. As the data is big its processing time can be intolerably long. To reduce processing time we have to use large infrastructure that enables the exploitation of parallel processing wherever it is possible in the simulation process. Clouds provide the required large set of computing resources and hence we need simulation environments that enable the easy exploitation of cloud resources. This keynote speech introduces a cloud-oriented simulation platform that enables the exploitation of large cloud resources as well as accessing all the major data storage types. This platform called as WS-PGRADE/gUSE is intensively used in many EU FP7 projects among them in CloudSME where the main target is to enable particularly small and medium-sized manufacturing and engineering companies (SMEs), to use state of the art simulation technology as a Service (SaaS, one-stop-shop, pay-per-use) in the cloud. In this talk we will show the main features of WS-PGRADE/gUSE that enable the use of cloud and large data resources to conduct distributed simulations. First, the workflow creation and execution mechanism will be explained. Then the DCI Bridge service will be shown that enables the exploitation of many independent cloud resources in parallel. Finally, the Data Avenue service that enables the access and transfer of large data among various types of data storages will be described. 
These services together enable the creation of simulation workflows that are easily portable among different distributed computing and data infrastructures including various types of clouds and cloud storages. At the end of the talk some concrete examples from the CloudSME project (www.cloudsme.eu) will highlight the main advantages of using the platform.","PeriodicalId":138284,"journal":{"name":"Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128966835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cloning Agent-based Simulation on GPU","authors":"Xiaosong Li, Wentong Cai, S. Turner","doi":"10.1145/2769458.2769470","DOIUrl":"https://doi.org/10.1145/2769458.2769470","url":null,"abstract":"Simulation cloning is an efficient way to analyze multiple configurations in a parameter exploration task. This paper presents a generic approach to perform incremental agent-based simulation cloning and discusses its implementation on GPU. Compared with the incremental cloning of parallel and distributed simulation (PADS), cloning agent-based simulation (ABS) has new challenges due to the unique way how ABS is executed. In this paper, to support incremental cloning, mechanisms for both actively and passively cloning agents are proposed. A scheme to maintain the correct context of each cloned ABS instance is developed. In addition, a strategy to restrain the propagation of passive cloning in order to maximize computation sharing amongst cloned ABS instances is also investigated. The implementation of our proposed approach on GPU supports concurrent execution of agents within each simulation instance as well as concurrent execution of multiple simulation instances. Performance of the proposed approach is evaluated and analyzed using a case study of an agent-based evacuation simulation on a NVIDIA Quadro 2000 GPU. Our experiment results demonstrate that cloning can significantly speed up the overall parameter exploration task. 
The proposed approach achieves 2.4 to 5.1 times speedup for parameter exploration tasks containing 8 to 125 simulation instances that evaluate different parameter configurations.","PeriodicalId":138284,"journal":{"name":"Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","volume":"179 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126767912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NUMA Time Warp","authors":"Alessandro Pellegrini, F. Quaglia","doi":"10.1145/2769458.2769479","DOIUrl":"https://doi.org/10.1145/2769458.2769479","url":null,"abstract":"It is well known that Time Warp may suffer from large usage of memory, which may hamper the efficiency of the memory hierarchy. To cope with this issue, several approaches have been devised, mostly based on the reduction of the amount of used virtual memory, e.g., by the avoidance of checkpointing and the exploitation of reverse computing. In this article we present an orthogonal solution aimed at optimizing the latency for memory access operations when running Time Warp systems on Non-Uniform Memory Access (NUMA) multi-processor/multi-core computing systems. More in detail, we provide an innovative Linux-based architecture allowing per simulation-object management of memory segments made up by disjoint sets of pages, and supporting both static and dynamic binding of the memory pages reserved for an individual object to the different NUMA nodes, depending on what worker thread is in charge of running that simulation object along a given wall-clock-time window. Our proposal not only manages the virtual pages used for the live state image of the simulation object, rather, it also copes with memory pages destined to keep the simulation object's event buffers and any recoverability data. Further, the architecture allows memory access optimization for data (messages) exchanged across the different simulation objects running on the NUMA machine. Our proposal is fully transparent to the application code, thus operating in a seamless manner. Also, a free software release of our NUMA memory manager for Time Warp has been made available within the open source ROOT-Sim simulation platform. 
Experimental data for an assessment of our innovative proposal are also provided in this article.","PeriodicalId":138284,"journal":{"name":"Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121744967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Experiments with Hardware-based Transactional Memory in Parallel Simulation","authors":"Joshua Hay, P. Wilsey","doi":"10.1145/2769458.2769462","DOIUrl":"https://doi.org/10.1145/2769458.2769462","url":null,"abstract":"Transactional memory is a concurrency control mechanism that dynamically determines when threads may safely execute critical sections of code. It provides the performance of fine-grained locking mechanisms with the simplicity of coarse-grained locking mechanisms. With hardware based transactions, the protection of shared data accesses and updates can be evaluated at runtime so that only true collisions to shared data force serialization. This paper explores the use of transactional memory as an alternative to conventional synchronization mechanisms for managing the pending event set in a Time Warp synchronized parallel simulator. In particular, we explore the application of Intel's hardware-based transactional memory (TSX) to manage shared access to the pending event set by the simulation threads. Comparison between conventional locking mechanisms and transactional memory access is performed to evaluate each within the warped Time Warp synchronized parallel simulation kernel. In this testing, evaluation of both forms of transactional memory found in the Intel Haswell processor, Hardware Lock Elision (HLE) and Restricted Transactional Memory (RTM), are evaluated. 
The results show that RTM generally outperforms conventional locking mechanisms and that HLE provides consistently better performance than conventional locking mechanisms, in some cases as much as 27%.","PeriodicalId":138284,"journal":{"name":"Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128022488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Biological Systems","authors":"N. Mustafee","doi":"10.1145/3247425","DOIUrl":"https://doi.org/10.1145/3247425","url":null,"abstract":"","PeriodicalId":138284,"journal":{"name":"Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127001134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Knowledge Discovery in Manufacturing Simulations","authors":"N. Feldkamp, S. Bergmann, S. Strassburger","doi":"10.1145/2769458.2769468","DOIUrl":"https://doi.org/10.1145/2769458.2769468","url":null,"abstract":"Discrete event simulation studies in a manufacturing context are a powerful instrument when modeling and evaluating processes of various industries. Usually simulation experts conduct simulation experiments for a predetermined system specification by manually varying parameters through educated assumptions and according to a prior defined goal. Moreover, simulation experts try to reduce complexity and number of simulation runs by excluding parameters that they consider as not influential regarding the simulation project scope. On the other hand, today's world of big data technology enables us to handle huge amounts of data. We therefore investigate the potential benefits of designing large scale experiments with a much broader coverage of possible system behavior. In this paper, we propose an approach for applying data mining methods on simulation data in combination with suitable visualization methods in order to uncover relationships in model behavior to discover knowledge that otherwise would have remained hidden. 
For a prototypical demonstration we used a clustering algorithm to divide large amounts of simulation output datasets into groups of similar performance values and depict those groups through visualizations to conduct a visual investigation process of the simulation data.","PeriodicalId":138284,"journal":{"name":"Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124468555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Virtual Time System for Linux-container-based Emulation of Software-defined Networks","authors":"Jiaqi Yan, Dong Jin","doi":"10.1145/2769458.2769480","DOIUrl":"https://doi.org/10.1145/2769458.2769480","url":null,"abstract":"Realistic and scalable testing systems are critical to evaluate network applications and protocols to ensure successful real system deployments. Container-based network emulation is attractive because of the combination of many desired features of network simulators and physical testbeds. The success of Mininet, a popular software-defined networking (SDN) emulation testbed, demonstrates the value of such approach that we can execute unmodified binary code on a large-scale emulated network with lightweight OS-level virtualization techniques. However, an ordinary network emulator uses the system clock across all the containers even if a container is not being scheduled to run. This leads to the issue of temporal fidelity, especially with high workloads. Virtual time sheds the light on the issue of preserving temporal fidelity for large-scale emulation. The key insight is to trade time with system resources via precisely scaling the time of interactions between containers and physical devices by a factor of n, hence, making an emulated network appear to be n times faster from the viewpoints of applications in the container. In this paper, we develop a lightweight Linux-container-based virtual time system and integrate the system to Mininet for fidelity and scalability enhancement. We also design an adaptive time dilation scheduling module for balancing speed and accuracy. Experimental results demonstrate that (1) with virtual time, Mininet is able to accurately emulate a network n times larger in scale, where n is the scaling factor, with the system behaviors closely match data obtained from a physical testbed; and (2) with the adaptive time dilation scheduling, we reduce the running time by 46% with little accuracy loss. 
Finally, we present a case study using the virtual-time-enabled Mininet to evaluate the limitations of equal-cost multi-path (ECMP) routing in a data center network.","PeriodicalId":138284,"journal":{"name":"Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125357694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Manufacturing Applications","authors":"R. Fujimoto","doi":"10.1145/3247419","DOIUrl":"https://doi.org/10.1145/3247419","url":null,"abstract":"","PeriodicalId":138284,"journal":{"name":"Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126534187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a DEVS-based Operating System","authors":"Daniella Niyonkuru, Gabriel A. Wainer","doi":"10.1145/2769458.2769465","DOIUrl":"https://doi.org/10.1145/2769458.2769465","url":null,"abstract":"Embedded systems are becoming increasingly complex and heterogeneous. Formal methods have proven effective in ensuring reliability and safety. However, they are hard to scale up. Modeling and Simulation (M&S)-based methods, on the other hand, deal effectively with scalability issues and provide the benefits of a risk-free testing environment. Yet, they are usually at most semi-formal, and models are not directly executed on the target hardware. To address the above challenges, we present a formal M&S-based kernel that runs on bare-metal and executes the original simulation models on the target hardware.","PeriodicalId":138284,"journal":{"name":"Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","volume":"259 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122615135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Accuracy and Performance Through Automatic Model Generation for Gate-Level Circuit PDES with Reverse Computation","authors":"Elsa Gonsiorowski, Justin M. LaPre, C. Carothers","doi":"10.1145/2769458.2769463","DOIUrl":"https://doi.org/10.1145/2769458.2769463","url":null,"abstract":"Gate-level circuit simulation is an important step in the design and validation of complex circuits. This step of the process relies on existing libraries for gate specifications. We start with a generic gate model for Rensselaer's Optimistic Simulation System (ROSS), a parallel discrete-event simulation framework. This generic model encompasses all functionality needed by optimistic simulation using reverse computation. We then describe a parser system which uses a standardized gate library to create a specific model for simulation. The generated model is comprised of several functions including those needed for an accurate model of timing behavior. To quantify the improvements that an automatically generated model can have over a handwritten model we compare two gate library models: an automatically generated LSI-10K library model and a previously investigated, handwritten, simplified GTECH library model. We conclude that the automatically generated model is a more accurate model of actual hardware. The generated model also represents the timing behavior with an approximately 50 times higher degree of fidelity. In comparison to previous results, we find that the automatically generated model is able to achieve better optimistic simulation performance when measured against conservative simulation. 
We identify peak optimistic performance when using 128 MPI-Ranks on eight nodes of an IBM Blue Gene/Q machine.","PeriodicalId":138284,"journal":{"name":"Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115175382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}