{"title":"A Numerical Algorithm for the Decomposition of Cooperating Structured Markov Processes","authors":"A. Marin, S. R. Bulò, S. Balsamo","doi":"10.1109/MASCOTS.2012.52","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.52","url":null,"abstract":"Modern computer systems consist of a large number of dynamic hardware and software components that interact according to some specific rules. Quantitative models of such systems are important for performance engineering because they allow for an earlier prediction of the quality of service. The application of stochastic modelling for this purpose is limited by the problem of the explosion of the state space of the model, i.e. the number of states that should be considered for an exact analysis increases exponentially and is thus huge even when few components are considered. In this paper we resort to product-form theory to deal with this problem. We define an iterative algorithm with the following characteristics: a) it deals with models with infinite state space and block regular structure (e.g. quasi-birth&death) without the need of truncation; b) in case of detections of product-form according to RCAT conditions, it computes the exact solution of the model; c) in case of non-product-form, it computes an approximate solution. 
The very loose assumptions allow us to provide examples of analysis of heterogeneous product-form models (e.g., consisting of queues with catastrophes and/or batch removals) as well as approximating non-product-form models with non-exponential service time distributions and negative customers.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114556175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Use of GPUs in Realizing Cost-Effective Distributed RAID","authors":"Aleksandr Khasymski, M. M. Rafique, A. Butt, Sudharshan S. Vazhkudai, Dimitrios S. Nikolopoulos","doi":"10.1109/MASCOTS.2012.59","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.59","url":null,"abstract":"The exponential growth in user and application data entails new means for providing fault tolerance and protection against data loss. High Performance Computing (HPC) storage systems, which are at the forefront of handling the data deluge, typically employ hardware RAID at the backend. However, such solutions are costly, do not ensure end-to-end data integrity, and can become a bottleneck during data reconstruction. In this paper, we design an innovative solution to achieve a flexible, fault-tolerant, and high-performance RAID-6 solution for a parallel file system (PFS). Our system utilizes low-cost, strategically placed GPUs - both on the client and server sides - to accelerate parity computation. In contrast to hardware-based approaches, we provide full control over the size, length and location of a RAID array on a per file basis, end-to-end data integrity checking, and parallelization of RAID array reconstruction. 
We have deployed our system in conjunction with the widely-used Lustre PFS, and show that our approach is feasible and imposes acceptable overhead.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128042748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Scalable Algorithm for Placement of Virtual Clusters in Large Data Centers","authors":"A. Tantawi","doi":"10.1109/MASCOTS.2012.11","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.11","url":null,"abstract":"We consider the problem of placing virtual clusters, each consisting of a set of heterogeneous virtual machines (VM) with some interrelationships due to communication needs and other dependability-induced constraints, onto physical machines (PM) in a large data center. The placement of such constrained, networked virtual clusters, including compute, storage, and networking resources is challenging. The size of the problem forces one to resort to approximate and heuristics-based optimization techniques. We introduce a statistical approach based on importance sampling (also known as cross-entropy) to solve this placement problem. A straightforward implementation of such a technique proves inefficient. We considerably enhance the method by biasing the sampling process to incorporate communication needs and other constraints of requests to yield an efficient algorithm that is linear in the size of the data center. 
We investigate the quality of the results of using our algorithm on a simulated system, where we study the effects of various parameters on the solution and performance of the algorithm.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125533011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparing the ns-3 Propagation Models","authors":"Mirko Stoffers, G. Riley","doi":"10.1109/MASCOTS.2012.17","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.17","url":null,"abstract":"An important aspect of any network simulation that models wireless networks is the design and implementation of the Propagation Loss Model. The propagation loss model is used to determine the wireless signal strength at the set of receivers for any packet being transmitted by a single transmitter. There are a number of different ways to model this phenomenon, and these vary both in terms of computational complexity and in the measured performance of the wireless network being modeled. In fact, the ns-3 simulator presently has 11 different loss models included in the simulator library. We performed a detailed study of these models, comparing their overall performance both in terms of the computational complexity of the algorithms, as well as the measured performance of the wireless network being simulated. The results of these simulation experiments are reported and discussed. Not surprisingly, we observed considerable variation in both metrics.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129637859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scheduling in Flash-Based Solid-State Drives - Performance Modeling and Optimization","authors":"W. Bux, Xiao-Yu Hu, I. Iliadis, R. Haas","doi":"10.1109/MASCOTS.2012.58","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.58","url":null,"abstract":"In this paper, we study the performance of solid-state drives that employ flash technology as storage medium. Our prime objective is to understand how the scheduling of the user-generated read and write commands and the read, write, and erase operations induced by the garbage-collection process affect the basic performance measures throughput and latency. We demonstrate that the most straightforward scheduling that prioritizes the processing of garbage-collection-related commands over user-related commands suffers from severe latency deficiencies. These problems can be overcome by using a more sophisticated priority scheme that minimizes the user-perceived latency without throughput penalty or deadlock exposure. Using both analysis and simulation, we investigate how these schemes perform under a variety of system design parameters and workloads. Our results can be directly applied to the engineering of a performance-optimized solid-state-drive system.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133852309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Learning-Based Self-Adjusting Concurrency in Software Transactional Memory Systems","authors":"Diego Rughetti, P. D. Sanzo, B. Ciciani, F. Quaglia","doi":"10.1109/MASCOTS.2012.40","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.40","url":null,"abstract":"One of the problems of Software-Transactional-Memory (STM) systems is the performance degradation that can be experienced when applications run with a non-optimal concurrency level, namely number of concurrent threads. When this level is too high a loss of performance may occur due to excessive data contention and consequent transaction aborts. Conversely, if concurrency is too low, the performance may be penalized due to limitation of both parallelism and exploitation of available resources. In this paper we propose a machine-learning based approach which enables STM systems to predict their performance as a function of the number of concurrent threads in order to dynamically select the optimal concurrency level during the whole lifetime of the application. In our approach, the STM is coupled with a neural network and an on-line control algorithm that activates or deactivates application threads in order to maximize performance via the selection of the most adequate concurrency level, as a function of the current data access profile. A real implementation of our proposal within the TinySTM open-source package and an experimental study relying on the STAMP benchmark suite are also presented. 
The experimental data confirm how our self-adjusting concurrency scheme constantly provides optimal performance, thus avoiding performance loss phases caused by non-suited selection of the amount of concurrent threads and associated with the above depicted phenomena.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129504140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Solving the TCP-Incast Problem with Application-Level Scheduling","authors":"Maxim Podlesny, C. Williamson","doi":"10.1109/MASCOTS.2012.21","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.21","url":null,"abstract":"Data center networks are characterized by high link speeds, low propagation delays, small switch buffers, and temporally clustered arrivals of many concurrent TCP flows fulfilling data transfer requests. However, the combination of these features can lead to transient buffer overflow and bursty packet losses, which in turn lead to TCP retransmission timeouts that degrade the performance of short-lived flows. This so-called TCP-incast problem can cause TCP throughput collapse. In this paper, we explore an application-level approach for solving this problem. The key idea of our solution is to coordinate the scheduling of short-lived TCP flows so that no data loss happens. We develop a mathematical model of lossless data transmission, and estimate the maximum goodput achievable in data center networks. The results indicate non-monotonic goodput that is highly sensitive to specific parameter configurations in the data center network. We validate our model using ns-2 network simulations, which show good correspondence with the theoretical results.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131377384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hop Distance Analysis in Partially Connected Wireless Sensor Networks","authors":"Yun Wang, Brendan M. Kelly, Aimin Zhou","doi":"10.1109/MASCOTS.2012.28","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.28","url":null,"abstract":"Network connectivity, as a fundamental issue in a wireless sensor network (WSN), has been receiving considerable attention during the past decade. Most works focused on how to maintain full connectivity while conserving network resources. However, full connectivity is actually a sufficient but not necessary condition for many WSNs to communicate and function successfully. In addition, full connectivity requires high-demand in network cost as more sensors will be needed. Further, it is subject to high energy consumption and communication interference as higher communication power might be needed to connect the most isolated sensors. In view of this, this work investigates the hop distance in a randomly deployed WSN with partial network connectivity through modeling, analysis, and simulation perspectives. The results help in selecting critical network parameters for practical WSN designs of diverse WSN applications.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114326666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"H-SWD: Incorporating Hot Data Identification into Shingled Write Disks","authors":"Chung-I Lin, Dongchul Park, Weiping He, D. Du","doi":"10.1109/MASCOTS.2012.44","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.44","url":null,"abstract":"Shingled write disk (SWD) is a magnetic hard disk drive that adopts the shingled magnetic recording (SMR) technology to overcome the areal density limit faced in conventional hard disk drives (HDDs). The SMR design enables SWDs to achieve two to three times higher areal density than the HDDs can reach, but it also makes SWDs unable to support random writes/in-place updates with no performance penalty. In particular, a SWD must contend with random write/update interference, in which writing to one track overwrites the data previously stored on the subsequent tracks. Some research has been proposed to serve random write/update out-of-place to alleviate the performance degradation at the cost of bringing in the concept of garbage collection. However, none of these studies investigate SWDs based on the garbage collection performance. In this paper, we propose a SWD design called Hot data identification-based Shingled Write Disk (H-SWD). The H-SWD adopts a window-based hot data identification to effectively manage data in the hot bands and the cold bands such that it can significantly reduce the garbage collection overhead while preventing the random write/update interference. The experimental results with various realistic workloads demonstrate that H-SWD outperforms the Indirection System. 
Specifically, incorporating a simple hot data identification empowers the H-SWD design to remarkably improve garbage collection performance.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114500022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extent Mapping Scheme for Flash Memory Devices","authors":"Young-Kyoon Suh, Bongki Moon, A. Efrat, Jin-Soo Kim, Sang-Won Lee","doi":"10.1109/MASCOTS.2012.45","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.45","url":null,"abstract":"Flash memory devices commonly rely on traditional address mapping schemes such as page mapping, block mapping or a hybrid of the two. Page mapping is more flexible than block mapping or hybrid mapping without being restricted by block boundaries. However, its mapping table tends to grow large quickly as the capacity of flash memory devices does. To overcome this limitation, we propose a novel mapping scheme that is fundamentally different from the existing mapping strategies. We call this new scheme Virtual Extent Trie (VET), as it manages mapping information by treating each I/O request as an extent and by using extents as basic mapping units rather than pages or blocks. By storing extents instead of individual addresses, VET consumes much less memory to store mapping information and still remains as flexible as page mapping. We observed in our experiments that VET reduced memory consumption by up to an order of magnitude in comparison with the traditional mapping schemes for several real world workloads. The VET scheme also scaled well with increasing address spaces by synthetic workloads. With a binary search mechanism, VET limits the mapping time to O(log log |U|), where U denotes the set of all possible logical addresses. 
Though the asymptotic mapping cost of VET is higher than the O(1) time of a page mapping scheme, the amount of increased overhead was almost negligible or low enough to be hidden by an accompanying I/O operation.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"302 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122487051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}