2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems: Latest Publications

An Offline Demand Estimation Method for Multi-threaded Applications
Juan F. Pérez, Sergio Pacheco-Sanchez, G. Casale
{"title":"An Offline Demand Estimation Method for Multi-threaded Applications","authors":"Juan F. Pérez, Sergio Pacheco-Sanchez, G. Casale","doi":"10.1109/MASCOTS.2013.10","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.10","url":null,"abstract":"Parameterizing performance models for multi-threaded enterprise applications requires finding the service rates offered by worker threads to the incoming requests. Statistical inference on monitoring data is here helpful to reduce the overheads of application profiling and to infer missing information. While linear regression of utilization data is often used to estimate service rates, it suffers erratic performance and also ignores a large part of application monitoring data, e.g., response times. Yet inference from other metrics, such as response times or queue-length samples, is complicated by the dependence on scheduling policies. To address these issues, we propose novel scheduling-aware estimation approaches for multi-threaded applications based on linear regression and maximum likelihood estimators. The proposed methods estimate demands from samples of the number of requests in execution in the worker threads at the admission instant of a new request. Validation results are presented on simulated and real application datasets for systems with multi-class requests, class switching, and admission control.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127863412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 24
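As context for the abstract above, the snippet below sketches the classic utilization-regression baseline that the paper contrasts its scheduling-aware estimators against: per-class demands are recovered from the utilization law U ≈ Σ_c X_c · D_c by regressing measured utilization on per-class throughputs. The synthetic data and non-negative least-squares solver are assumptions for illustration; this is not the paper's proposed method.

```python
# Minimal sketch of utilization-regression demand estimation (baseline, not the paper's estimator).
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
true_demands = np.array([0.020, 0.050, 0.110])   # assumed per-class service demands (seconds/request)

# Simulated monitoring windows: per-class throughputs (req/s) and noisy utilization samples.
throughputs = rng.uniform(1.0, 20.0, size=(200, 3))
utilization = throughputs @ true_demands + rng.normal(0.0, 0.02, size=200)

# Non-negative least squares keeps the estimated demands physically meaningful.
est_demands, _ = nnls(throughputs, utilization)
print("estimated demands:", np.round(est_demands, 4))
```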
PAB: Parallelism-Aware Buffer Management Scheme for Nand-Based SSDs
Xufeng Guo, Jianfeng Tan, Yuping Wang
{"title":"PAB: Parallelism-Aware Buffer Management Scheme for Nand-Based SSDs","authors":"Xufeng Guo, Jianfeng Tan, Yuping Wang","doi":"10.1109/MASCOTS.2013.18","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.18","url":null,"abstract":"Recently, internal buffer module and multi-level parallel components have already become the standard elements of SSDs. The internal buffer module is always used as a write cache, reducing the erasures and thus improving overall performance. The multi-level parallelism is exploited to service requests in a concurrent or interleaving manner, which promotes the system throughput. These two aspects have been extensively discussed in the literature. However, current buffer algorithms cannot take full advantage of parallelism inside SSDs. In this paper, we propose a novel write buffer management scheme called Parallelism-Aware Buffer (PAB). In this scheme, the buffer is divided into two parts named as Work-Zone and Para-Zone respectively. Conventional buffer algorithms are employed in the Work-Zone, while the Para-Zone is responsible for reorganizing the requests evicted from Work-Zone according to the underlying parallelism. Simulation results show that with only a small size of Para-Zone, PAB can achieve 19.2% ~ 68.1% enhanced performance compared with LRU based on a page-mapping FTL, while this improvement scope becomes 5.6% ~ 35.6% compared with BPLRU based on the state-of-the-art block-mapping FTL known as FAST.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125497350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 7
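To make the Work-Zone / Para-Zone split concrete, here is a heavily simplified toy model: an LRU Work-Zone evicts pages into a Para-Zone that buckets them by the channel they map to (a simple modulo mapping, which is an assumption) and flushes round-robin batches so consecutive writes hit distinct channels. It illustrates the idea only, not the paper's exact policy or its FTL interaction.

```python
# Toy sketch of a parallelism-aware buffer: LRU Work-Zone + channel-bucketed Para-Zone.
from collections import OrderedDict, defaultdict

NUM_CHANNELS = 4  # assumed channel count

class PABBuffer:
    def __init__(self, work_capacity, para_capacity):
        self.work = OrderedDict()        # Work-Zone: plain LRU over logical page numbers
        self.work_capacity = work_capacity
        self.para = defaultdict(list)    # Para-Zone: evicted pages bucketed per channel
        self.para_capacity = para_capacity
        self.para_size = 0

    def write(self, lpn):
        self.work.pop(lpn, None)
        self.work[lpn] = True            # move to most-recently-used position
        if len(self.work) > self.work_capacity:
            victim, _ = self.work.popitem(last=False)
            self._to_para_zone(victim)

    def _to_para_zone(self, lpn):
        self.para[lpn % NUM_CHANNELS].append(lpn)   # assumed static page-to-channel mapping
        self.para_size += 1
        if self.para_size >= self.para_capacity:
            self.flush()

    def flush(self):
        # Emit round-robin batches so each batch touches distinct channels concurrently.
        batches = []
        while any(self.para.values()):
            batches.append([q.pop(0) for q in self.para.values() if q])
        self.para_size = 0
        print("flush batches (at most one page per channel each):", batches)

buf = PABBuffer(work_capacity=4, para_capacity=6)
for page in [0, 1, 2, 3, 4, 5, 8, 12, 9, 13, 6, 7]:
    buf.write(page)
```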
Improving the Revenue, Efficiency and Reliability in Data Center Spot Market: A Truthful Mechanism
Kai Song, Y. Yao, L. Golubchik
{"title":"Improving the Revenue, Efficiency and Reliability in Data Center Spot Market: A Truthful Mechanism","authors":"Kai Song, Y. Yao, L. Golubchik","doi":"10.1109/MASCOTS.2013.30","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.30","url":null,"abstract":"Data centers are typically over-provisioned, in order to meet certain service level agreements (SLAs) under worst-case scenarios (e.g., peak loads). Selling unused instances at discounted prices thus is a reasonable approach for data center providers to off-set the maintenance and operation costs. Spot market models are widely used for pricing and allocating unused instances. In this paper, we focus on mechanism design for a data center spot market (DCSM). Particularly, we propose a mechanism based on a repeated uniform price auction, and prove its truthfulness. In the mechanism, to achieve better quality of service, the flexibility of adjusting bids during job execution is provided, and a bidding adjustment model is also discussed. Four metrics are used to evaluate the mechanism: in addition to the commonly used metrics in auction theory, namely, revenue, efficiency, slowdown and waste are defined to capture the Quality of Service (QoS) provided by DCSMs. We prove that a uniform price action achieves optimal efficiency among all single-price auctions in DCSMs. We also conduct comprehensive simulations to explore the performance of the resulting DCSM. The result show that (1) the bidding adjustment model helps increase the revenue by an average of 5%, and decrease the slowdown and waste by average of 5% and 6%, respectively, (2) our model with repeated uniform price auction outperforms the current Amazon Spot Market by an average of 14% in revenue, 24% in efficiency, 13% in slowdown, and by 14% in waste. Parameter tuning studies are also performed to refine the performance of our mechanism.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126055384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 9
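For readers unfamiliar with uniform price auctions, the sketch below shows one clearing round for K identical spot instances: bids are ranked, the top K win, and all winners pay the same price, here taken as the highest losing bid. The single-unit-bid assumption and the clearing rule are illustrative; the paper's repeated mechanism and bid-adjustment model are not reproduced.

```python
# One round of a uniform price auction for identical spot instances (illustrative clearing rule).
def uniform_price_round(bids, capacity):
    """bids: dict bidder -> bid price; capacity: number of identical instances on offer."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winners = [bidder for bidder, _ in ranked[:capacity]]
    # Clearing price: highest rejected bid, or 0 if supply exceeds demand.
    clearing_price = ranked[capacity][1] if len(ranked) > capacity else 0.0
    return winners, clearing_price

bids = {"job-a": 0.42, "job-b": 0.35, "job-c": 0.30, "job-d": 0.28, "job-e": 0.15}
winners, price = uniform_price_round(bids, capacity=3)
print(winners, "all pay", price)   # ['job-a', 'job-b', 'job-c'] all pay 0.28
```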
A Versatile Performance and Energy Simulation Tool for Composite GPU Global Memory
Bin Wang, Yizheng Jiao, Weikuan Yu, Xipeng Shen, Dong Li, J. Vetter
{"title":"A Versatile Performance and Energy Simulation Tool for Composite GPU Global Memory","authors":"Bin Wang, Yizheng Jiao, Weikuan Yu, Xipeng Shen, Dong Li, J. Vetter","doi":"10.1109/MASCOTS.2013.39","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.39","url":null,"abstract":"As a cost-effective compute device, Graphic Processing Unit (GPU) has been widely embraced in the field of high performance computing. GPU is characterized by its massive thread-level parallelism and high memory bandwidth. Although GPU has exhibited tremendous potential, recent GPU architecture researches mainly focus on GPU compute units and full system exploration is rare due to the lack of accurate simulators that can reveal hardware organization of both GPU compute units and its memory system. In order to fill this void, we build a GPU simulator called VxGPUSim that can support the simulation with detailed performance, timing and power consumption statistics. Our experimental evaluation demonstrates that VxGPUSim can faithfully reveal the internal execution details of GPU global memory of various memory configurations. It can enable further research on the design of GPU global memory for performance and energy tradeoffs.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133102052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 2
A VoD System for Massively Scaled, Heterogeneous Environments: Design and Implementation
Kangwook Lee, Lisa Yan, Abhay K. Parekh, K. Ramchandran
{"title":"A VoD System for Massively Scaled, Heterogeneous Environments: Design and Implementation","authors":"Kangwook Lee, Lisa Yan, Abhay K. Parekh, K. Ramchandran","doi":"10.1109/MASCOTS.2013.8","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.8","url":null,"abstract":"We propose, analyze and implement a general architecture for massively parallel VoD content distribution. We allow for devices that have a wide range of reliability, storage and bandwidth constraints. Each device can act as a cache for other devices and can also communicate with a central server. Some devices may be dedicated caches with no co-located users. Our goal is to allow each user device to be able to stream any movie from a large catalog, while minimizing the load of the central server. First, we architect and formulate a static optimization problem that accounts for various network bandwidth and storage capacity constraints, as well as the maximum number of network connections for each device. Not surprisingly this formulation is NP-hard. We then use a Markov approximation technique in a primal-dual framework to devise a highly distributed algorithm which is provably close to the optimal. Next we test the practical effectiveness of the distributed algorithm in several ways. We demonstrate remarkable robustness to system scale and changes in demand, user churn, network failure and node failures via a packet level simulation of the system. Finally, we describe our results from numerous experiments on a full implementation of the system with 60 caches and 120 users on 20 Amazon EC2 instances. In addition to corroborating our analytical and simulation-based findings, the implementation allows us to examine various system-level tradeoffs. Examples of this include: (i) the split between server to cache and cache to device traffic, (ii) the tradeoff between cache update intervals and the time taken for the system to adjust to changes in demand, and (iii) the tradeoff between the rate of virtual topology updates and convergence. These insights give us the confidence to claim that a much larger system on the scale of hundreds of thousands of highly heterogeneous nodes would perform as well as our current implementation.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"348 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133102765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 8
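As a rough illustration of the placement trade-off at the heart of the abstract above (which movies each cache should store so that as few requests as possible fall back to the central server), the toy below uses a popularity-greedy heuristic under per-cache storage limits. This stands in for, and is much weaker than, the paper's Markov-approximation primal-dual algorithm; the popularity shares and cache sizes are assumptions.

```python
# Toy cache-placement heuristic: greedily place popular movies, then measure the
# fraction of requests that must still be served by the central server.
popularity = {"m1": 0.40, "m2": 0.25, "m3": 0.15, "m4": 0.12, "m5": 0.08}  # assumed request shares
caches = {"c1": 2, "c2": 2}                                                # assumed slots per cache

placement = {c: [] for c in caches}
for movie in sorted(popularity, key=popularity.get, reverse=True):
    # Put each movie on the first cache with free space that does not hold it yet.
    for cache, slots in caches.items():
        if len(placement[cache]) < slots and movie not in placement[cache]:
            placement[cache].append(movie)
            break

cached = {m for movies in placement.values() for m in movies}
server_share = sum(share for m, share in popularity.items() if m not in cached)
print(placement, "fraction of requests hitting the central server:", server_share)
```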
Self-Tuning Batching with DVFS for Improving Performance and Energy Efficiency in Servers
Dazhao Cheng, Yanfei Guo, Xiaobo Zhou
{"title":"Self-Tuning Batching with DVFS for Improving Performance and Energy Efficiency in Servers","authors":"Dazhao Cheng, Yanfei Guo, Xiaobo Zhou","doi":"10.1109/MASCOTS.2013.12","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.12","url":null,"abstract":"Performance improvement and energy efficiency are two important goals in provisioning Internet services in data center servers. In this paper, we propose and develop a self-tuning request batching mechanism to simultaneously achieve the two correlated goals. The batching mechanism increases the cache hit rate at the front-tier Web server, which provides the opportunity to improve application's performance and energy efficiency of the server system. The core of the batching mechanism is a novel and practical two-layer control system that adaptively adjusts the batching interval and frequency states of CPUs according to the service level agreement and the workload characteristics. The batching control adopts a self-tuning fuzzy model predictive control approach for application performance improvement. The power control dynamically adjusts the frequency of CPUs with DVFS in response to workload fluctuations for energy efficiency. A coordinator between the two control loops achieves the desired performance and energy efficiency. We implement the mechanism in a test bed and experimental results demonstrate that the new approach significantly improves the application's performance in terms of the system throughput and average response time. The results also illustrate it can reduce the energy consumption of the server system by 13% at the same time.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129369779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 12
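The sketch below illustrates the two-layer structure described in the abstract: an outer loop nudges the request-batching interval toward a response-time target, and an inner loop steps the CPU frequency (DVFS) according to observed utilization. Simple proportional and threshold rules stand in for the paper's fuzzy model-predictive controller and coordinator; the SLA target, gains, thresholds, and frequency levels are all assumptions.

```python
# Toy two-layer controller: batching-interval tuning plus DVFS frequency stepping.
SLA_RESPONSE_MS = 200.0
FREQ_LEVELS_GHZ = [1.2, 1.6, 2.0, 2.4]   # assumed available P-states

def tune_batching(interval_ms, measured_response_ms, gain=0.05):
    # Longer batching raises the cache hit rate but adds queueing delay; back off when the SLA slips.
    error = SLA_RESPONSE_MS - measured_response_ms
    return max(1.0, interval_ms + gain * error)

def tune_frequency(level, utilization, low=0.4, high=0.8):
    if utilization > high and level < len(FREQ_LEVELS_GHZ) - 1:
        return level + 1          # scale up to protect performance
    if utilization < low and level > 0:
        return level - 1          # scale down to save energy
    return level

interval, level = 20.0, 2
for response_ms, util in [(150, 0.35), (170, 0.50), (230, 0.85), (210, 0.75)]:
    interval = tune_batching(interval, response_ms)
    level = tune_frequency(level, util)
    print(f"interval={interval:5.1f} ms  freq={FREQ_LEVELS_GHZ[level]} GHz")
```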
Transforming System Load to Throughput for Consolidated Applications
Andrej Podzimek, L. Chen
{"title":"Transforming System Load to Throughput for Consolidated Applications","authors":"Andrej Podzimek, L. Chen","doi":"10.1109/MASCOTS.2013.37","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.37","url":null,"abstract":"Today's computing systems monitor and collect a large number of system load statistics, e.g., time series of CPU utilization, but utilization traces do not directly reflect application performance, e.g., response time and throughput. Indeed, resource utilization is the output of conventional performance evaluation approaches, such as queueing models and benchmarking, and often for a single application. In this paper, we address the following research question: How to turn utilization traces from consolidated applications into estimates of application performance metrics? To such an end, we developed \"Showstopper\", a novel and light-weight benchmarking methodology and tool which orchestrates execution of multi-threaded benchmarks on a multi-core system in parallel, so that the CPU load follows utilization traces and application performance metrics can thus be estimated efficiently. To generate the desired loads, Showstopper alternates stopped and runnable states of multiple benchmarks in a distributed fashion, dynamically adjusting their duty cycles using feedback control mechanisms. Our preliminary evaluation results show that Showstopper can sustain the target loads within 5% of error and obtain reliable throughput estimates for DaCapo benchmarks executed on Linux/x86-64 platforms.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130853699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 5
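The snippet below sketches the duty-cycle feedback idea from the abstract: in each control period a benchmark is kept runnable for a fraction of the period (the duty cycle) and stopped for the rest, and that fraction is nudged so the measured CPU load tracks a target utilization trace. The gain and the toy load-versus-duty-cycle model are assumptions; the real tool drives actual processes via stop/continue signals rather than a closed-form model.

```python
# Toy duty-cycle feedback loop that tracks a target CPU utilization trace.
def track_trace(target_trace, gain=0.5):
    duty = 0.5                              # initial fraction of each period spent runnable
    for target in target_trace:
        measured = duty * 0.95              # assumed model: load is roughly the duty cycle minus overhead
        error = target - measured
        duty = min(1.0, max(0.0, duty + gain * error))
        yield target, measured, duty

trace = [0.3, 0.3, 0.6, 0.6, 0.9, 0.9, 0.4]
for target, measured, duty in track_trace(trace):
    print(f"target={target:.2f}  measured={measured:.2f}  next duty cycle={duty:.2f}")
```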
Towards Machine Learning-Based Auto-tuning of MapReduce
N. Yigitbasi, Theodore L. Willke, Guangdeng Liao, D. Epema
{"title":"Towards Machine Learning-Based Auto-tuning of MapReduce","authors":"N. Yigitbasi, Theodore L. Willke, Guangdeng Liao, D. Epema","doi":"10.1109/MASCOTS.2013.9","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.9","url":null,"abstract":"MapReduce, which is the de facto programming model for large-scale distributed data processing, and its most popular implementation Hadoop have enjoyed widespread adoption in industry during the past few years. Unfortunately, from a performance point of view getting the most out of Hadoop is still a big challenge due to the large number of configuration parameters. Currently these parameters are tuned manually by trial and error, which is ineffective due to the large parameter space and the complex interactions among the parameters. Even worse, the parameters have to be re-tuned for different MapReduce applications and clusters. To make the parameter tuning process more effective, in this paper we explore machine learning-based performance models that we use to auto-tune the configuration parameters. To this end, we first evaluate several machine learning models with diverse MapReduce applications and cluster configurations, and we show that support vector regression model (SVR) has good accuracy and is also computationally efficient. We further assess our auto-tuning approach, which uses the SVR performance model, against the Starfish auto tuner, which uses a cost-based performance model. Our findings reveal that our auto-tuning approach can provide comparable or in some cases better performance improvements than Starfish with a smaller number of parameters. Finally, we propose and discuss a complete and practical end-to-end auto-tuning flow that combines our machine learning-based performance models with smart search algorithms for the effective training of the models and the effective exploration of the parameter space.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"229 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116386652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 103
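To illustrate the modeling step described in the abstract, the sketch below fits a support vector regression (SVR) model mapping configuration parameters to job runtime and then scores a candidate grid with the cheap surrogate to suggest a configuration. The chosen features, parameter ranges, and the synthetic runtime function are assumptions, not Hadoop measurements, and the grid scan is a stand-in for the smarter search algorithms discussed in the paper.

```python
# Surrogate-model auto-tuning sketch: train SVR on (configuration, runtime) pairs, then
# pick the candidate configuration the model predicts to be fastest.
import itertools
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)

def synthetic_runtime(cfg):
    # Assumed toy cost model standing in for real measured job runtimes.
    map_tasks, io_sort_mb, reduce_tasks = cfg
    return 300 / map_tasks + 0.5 * abs(io_sort_mb - 200) + 120 / reduce_tasks + rng.normal(0, 2)

configs = np.array([[m, s, r] for m in (4, 8, 16, 32)
                              for s in (100, 200, 400)
                              for r in (2, 4, 8)], dtype=float)
runtimes = np.array([synthetic_runtime(c) for c in configs])

model = make_pipeline(StandardScaler(), SVR(C=100.0, epsilon=1.0))
model.fit(configs, runtimes)

# Auto-tuning step: exhaustively score a candidate grid with the surrogate model.
candidates = np.array(list(itertools.product((4, 8, 16, 24, 32),
                                              (50, 100, 200, 300, 400),
                                              (2, 4, 6, 8))), dtype=float)
best = candidates[np.argmin(model.predict(candidates))]
print("suggested configuration (map tasks, sort buffer MB, reduce tasks):", best)
```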
Capacity of Simple Multiple-Input-Single-Output Wireless Networks over Uniform or Fractal Maps
P. Jacquet
{"title":"Capacity of Simple Multiple-Input-Single-Output Wireless Networks over Uniform or Fractal Maps","authors":"P. Jacquet","doi":"10.1109/MASCOTS.2013.66","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.66","url":null,"abstract":"We want to estimate the average capacity of MISO networks when several simultaneous emitters and a single access point are randomly distributed in an infinite fractal map embedded in a space of dimension D. We first show that the average capacity is a constant when the nodes are uniformly distributed in the space. This constant is function of the space dimension and of the signal attenuation factor, it holds even in presence of non i.i.d. fading effects. We second extend the analysis to fractal maps with a non integer dimension. In this case the constant still holds with the fractal dimension replacing D but the capacity shows small periodic oscillation around this constant when the node density varies. The practical consequence of this result is that the capacity increases significantly when the network map has a small fractal dimension.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130771548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 9
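As a numerical companion to the setting in the abstract, the Monte Carlo sketch below drops emitters uniformly in a D-dimensional ball around a single access point, attenuates power as r^(-alpha), and averages the Shannon capacity of the strongest emitter with the others treated as interference. This is only one plausible reading of the model; the paper's exact channel, fading, and capacity definitions are not reproduced, and all constants are assumptions.

```python
# Monte Carlo estimate of average capacity for uniformly placed emitters around one access point.
import numpy as np

def average_capacity(dim=2, alpha=4.0, n_emitters=20, radius=1.0, noise=1e-3, trials=2000, seed=0):
    rng = np.random.default_rng(seed)
    caps = []
    for _ in range(trials):
        # Uniform points in a D-dimensional ball via rejection sampling.
        pts = []
        while len(pts) < n_emitters:
            p = rng.uniform(-radius, radius, size=dim)
            if np.linalg.norm(p) <= radius:
                pts.append(p)
        dist = np.linalg.norm(np.array(pts), axis=1)
        power = dist ** (-alpha)                     # assumed path-loss-only channel
        signal = power.max()
        interference = power.sum() - signal
        caps.append(np.log2(1.0 + signal / (noise + interference)))
    return float(np.mean(caps))

print("average capacity (bits/s/Hz), D=2:", round(average_capacity(dim=2), 3))
```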
Configuring Cloud Admission Policies under Dynamic Demand
Merve Unuvar, Y. Doganata, A. Tantawi
{"title":"Configuring Cloud Admission Policies under Dynamic Demand","authors":"Merve Unuvar, Y. Doganata, A. Tantawi","doi":"10.1109/MASCOTS.2013.42","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.42","url":null,"abstract":"We consider the problem of admitting sets of, possibly heterogenous, virtual machines (VMs) with stochastic resource demands onto physical machines (PMs) in a Cloud environment. The objective is to achieve a specified quality-of-service related to the probability of resource over-utilization in an uncertain loading condition, while minimizing the rejection probability of VM requests. We introduce a method which relies on approximating the probability distribution of the total resource demand on PMs and estimating the probability of over-utilization. We compare our method to two simple admission policies: admission based on maximum demand and admission based on average demand. We investigate the efficiency of the results of using our method on a simulated Cloud environment where we analyze the effects of various parameters (commitment factor, coefficient of variation etc.) on the solution for highly variate demands.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"202 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132501722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 5
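The sketch below illustrates the kind of admission test the abstract describes: approximate the total stochastic demand on a physical machine (here, a normal approximation of the sum of independent per-VM demands, which is an assumption) and admit a new VM only if the estimated over-utilization probability stays below a target. The demand statistics, capacity, and threshold are illustrative, and the paper's exact approximation is not reproduced.

```python
# Probabilistic admission test under a normal approximation of total resource demand.
from math import erf, sqrt

def overutilization_prob(vm_means, vm_vars, capacity):
    mu, var = sum(vm_means), sum(vm_vars)
    if var == 0:
        return 0.0 if mu <= capacity else 1.0
    z = (capacity - mu) / sqrt(var)
    return 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))   # P(total demand > capacity)

def admit(vm_means, vm_vars, new_mean, new_var, capacity, epsilon=0.05):
    p = overutilization_prob(vm_means + [new_mean], vm_vars + [new_var], capacity)
    return p <= epsilon, p

placed_means, placed_vars = [2.0, 3.0, 1.5], [0.25, 0.50, 0.10]   # assumed CPU demand statistics
ok, p = admit(placed_means, placed_vars, new_mean=2.5, new_var=0.4, capacity=12.0)
print(f"admit={ok}  estimated P(over-utilization)={p:.3f}")
```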