2013 IEEE International Symposium on Workload Characterization (IISWC): Latest Publications

Revisiting the management control plane in virtualized cloud computing infrastructure

2013 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2013-09-01 DOI: 10.1109/IISWC.2013.6704680

V. Soundararajan, Lawrence Spracklen

Citations: 1

A structured approach to the simulation, analysis and characterization of smartphone applications

2013 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2013-09-01 DOI: 10.1109/IISWC.2013.6704677

Dam Sunwoo, William Wang, Mrinmoy Ghosh, Chander Sudanthi, G. Blake, C. D. Emmons, N. Paver

Abstract: Full-system simulators are invaluable tools for designing new architectures due to their ability to simulate full applications as well as capture operating system behavior, virtual machine or hypervisor behavior, and interference between concurrently-running applications. However, the systems under investigation and applications under test have become increasingly complicated leading to prohibitively long simulation times for a single experiment. This problem is compounded when many permutations of system design parameters and workloads are tested to investigate system sensitivities and full-system effects with confidence. In this paper, we propose a methodology to tractably explore the processor design space and to characterize applications in a full-system simulation environment. We combine SimPoint, Principal Component Analysis and Fractional Factorial experimental designs to substantially reduce the simulation effort needed to characterize and analyze workloads. We also present a non-invasive user-interface automation tool to allow us to study all types of workloads in a simulation environment. While our methodology is generally applicable to many simulators and workloads, we demonstrate the application of our proposed flow on smartphone applications running on the Android operating system within the gem5 simulation environment.

Citations: 35

Characterizing the efficiency of data deduplication for big data storage management

2013 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2013-09-01 DOI: 10.1109/IISWC.2013.6704674

Ruijin Zhou, Ming Liu, Tao Li

Abstract: The demand for data storage and processing is increasing at a rapid speed in the big data era. Such a tremendous amount of data pushes the limit on storage capacity and on the storage network. A significant portion of the dataset in big data workloads is redundant. As a result, deduplication technology, which removes replicas, becomes an attractive solution to save disk space and traffic in a big data environment. However, the overhead of extra CPU computation (hash indexing) and IO latency introduced by deduplication should be considered. Therefore, the net effect of using deduplication for big data workloads needs to be examined. To this end, we characterize the redundancy of typical big data workloads to justify the need for deduplication. We analyze and characterize the performance and energy impact brought by deduplication under various big data environments. In our experiments, we identify three sources of redundancy in big data workloads: 1) deploying more nodes, 2) expanding the dataset, and 3) using replication mechanisms. We elaborate on the advantages and disadvantages of different deduplication layers, locations, and granularities. In addition, we uncover the relation between energy overhead and the degree of redundancy. Furthermore, we investigate the deduplication efficiency in an SSD environment for big data workloads.

Citations: 39

(Mis)understanding the NUMA memory system performance of multithreaded workloads

2013 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2013-09-01 DOI: 10.1109/IISWC.2013.6704666

Z. Majó, T. Gross

Abstract: An important aspect of workload characterization is understanding memory system performance (i.e., understanding a workload's interaction with the memory system). On systems with a non-uniform memory architecture (NUMA) the performance critically depends on the distribution of data and computations. The actual memory access patterns have a large influence on performance on systems with aggressive prefetcher units. This paper describes an analysis of the memory system performance of multithreaded programs and shows that some programs are (unintentionally) structured so that they use the memory system of today's NUMA-multicores inefficiently: Programs exhibit program-level data sharing, a performance-limiting factor that makes data and computation distribution in NUMA systems difficult. Moreover, many programs have irregular memory access patterns that are hard to predict by processor prefetcher units. The memory system performance as observed for a given program on a specific platform depends also on many algorithm and implementation decisions. The paper shows that a set of simple algorithmic changes coupled with commonly available OS functionality suffice to eliminate data sharing and to regularize the memory access patterns for a subset of the PARSEC parallel benchmarks. These simple source-level changes result in performance improvements of up to 3.1X, but more importantly, they lead to a fairer and more accurate performance evaluation on NUMA-multicore systems. They also illustrate the importance of carefully considering all details of algorithms and architectures to avoid drawing incorrect conclusions.

Citations: 38

Do C and Java programs scale differently on Hardware Transactional Memory?

2013 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2013-09-01 DOI: 10.1109/IISWC.2013.6704668

Rei Odaira, J. Castaños, T. Nakaike

Citations: 5

Semantic characterization of MapReduce workloads

2013 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2013-09-01 DOI: 10.1109/IISWC.2013.6704673

Zhihong Xu, Martin Hirzel, G. Rothermel

Citations: 11

Performance, energy characterizations and architectural implications of an emerging mobile platform benchmark suite - MobileBench

2013 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2013-09-01 DOI: 10.1109/IISWC.2013.6704679

D. Pandiyan, Shin-Ying Lee, Carole-Jean Wu

Abstract: In this paper, we explore key microarchitectural features of mobile computing platforms that are crucial to the performance of smart phone applications. We create and use a selection of representative smart phone applications, which we call MobileBench that aid in this analysis. We also evaluate the effectiveness of current memory subsystem on the mobile platforms. Furthermore, by instrumenting the Android framework, we perform energy characterization for MobileBench on an existing Samsung Galaxy S III smart phone. Based on our energy analysis, we find that application cores on modern smart phones consume significant amount of energy. This motivates our detailed performance analysis centered at the application cores. Based on our detailed performance studies, we reach several key findings. (i) Using a more sophisticated tournament branch predictor can improve the branch prediction accuracy but this does not translate to observable performance gain. (ii) Smart phone applications show distinct TLB capacity needs. Larger TLBs can improve performance by an avg. of 14%. (iii) The current L2 cache on most smart phone platform experiences poor utilization because of the fast-changing memory requirements of smart phone applications. Using a more effective cache management scheme improves the L2 cache utilization by as much as 29.3% and by an avg. of 12%. (iv) Smart phone applications are prefetching-friendly. Using a simple stride prefetcher can improve performance across MobileBench applications by an avg. of 14%. (v) Lastly, the memory bandwidth requirements of MobileBench applications are moderate and well under current smart phone memory bandwidth capacity of 8.3 GB/s. With these insights into the smart phone application characteristics, we hope to guide the design of future smart phone platforms for lower power consumptions through simpler architecture while achieving high performance.

Citations: 68

Hardware-independent application characterization

2013 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2013-09-01 DOI: 10.2172/1214640

S. Pakin, P. McCormick

Citations: 18

Modeling virtual machines misprediction overhead

2013 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2013-09-01 DOI: 10.1109/IISWC.2013.6704681

D. C. S. Lucas, R. Auler, Rafael Dalibera, S. Rigo, E. Borin, G. Araújo

Citations: 5

Pannotia: Understanding irregular GPGPU graph applications

2013 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2013-09-01 DOI: 10.1109/IISWC.2013.6704684

Shuai Che, Bradford M. Beckmann, S. Reinhardt, K. Skadron

Abstract: GPUs have become popular recently to accelerate general-purpose data-parallel applications. However, most existing work has focused on GPU-friendly applications with regular data structures and access patterns. While a few prior studies have shown that some irregular workloads can also achieve speedups on GPUs, this domain has not been investigated thoroughly. Graph applications are one such set of irregular workloads, used in many commercial and scientific domains. In particular, graph mining -as well as web and social network analysis- are promising applications that GPUs could accelerate. However, implementing and optimizing these graph algorithms on SIMD architectures is challenging because their data-dependent behavior results in significant branch and memory divergence. To address these concerns and facilitate research in this area, this paper presents and characterizes a suite of GPGPU graph applications, Pannotia, which is implemented in OpenCL and contains problems from diverse and important graph application domains. We perform a first-step characterization and analysis of these benchmarks and study their behavior on real hardware. We also use clustering analysis to illustrate the similarities and differences of the applications in the suite. Finally, we make architectural and scheduling suggestions that will improve their execution efficiency on GPUs.

Citations: 178