2011 IEEE International Symposium on Workload Characterization (IISWC): Latest Papers, Page 3

Two-level soft error vulnerability prediction on SMT/CMP architectures

2011 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2011-11-06 DOI: 10.1109/IISWC.2011.6114203

Lide Duan, Lu Peng, Bin Li

Citations: 4

Performance characteristics of Graph500 on large-scale distributed environment

2011 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2011-11-06 DOI: 10.1109/IISWC.2011.6114175

T. Suzumura, Koji Ueno, Hitoshi Sato, K. Fujisawa, S. Matsuoka

Abstract: Graph500 is a new benchmark for supercomputers based on large-scale graph analysis, which is becoming an important form of analysis in many real-world applications. Graph algorithms run well on supercomputers with shared memory. For the Linpack-based supercomputer rankings, TOP500 reports that heterogeneous and distributed-memory supercomputers with large numbers of GPGPUs are becoming dominant. However, the performance characteristics of large-scale graph analysis benchmarks such as Graph500 on distributed-memory supercomputers have so far received little study. This is the first report of a performance evaluation and analysis for Graph500 on a commodity-processor-based distributed-memory supercomputer. We found that the reference implementation "replicated-csr", based on distributed level-synchronized breadth-first search, solves a large free graph problem with 2^31 vertices and 2^35 edges (approximately 2.15 billion vertices and 34.4 billion edges) in 3.09 seconds with 128 nodes and 3,072 cores. This equates to 11 giga-edges traversed per second. We describe the algorithms and implementations of the reference implementations of Graph500, and analyze the performance characteristics with varying graph sizes, numbers of computer nodes, and different implementations. Our results will also contribute to the development of optimized algorithms for the coming exascale machines.
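The abstract above names level-synchronized breadth-first search over a CSR graph and reports a traversal rate. The following is a minimal single-process sketch of that search pattern, not the distributed "replicated-csr" reference code; the function names `bfs_level_synchronized` and `teps`, and the tiny example graph, are hypothetical. The last lines check the reported arithmetic: 2^35 edges in 3.09 seconds is roughly 11 giga-edges per second.

```python
def bfs_level_synchronized(row_ptr, col_idx, root):
    """Breadth-first search that processes one whole frontier per step.

    row_ptr / col_idx are the two arrays of a CSR adjacency structure:
    the neighbors of vertex u are col_idx[row_ptr[u]:row_ptr[u + 1]].
    Returns a parent array (-1 marks unreached vertices).
    """
    n = len(row_ptr) - 1
    parent = [-1] * n
    parent[root] = root          # the root is its own parent
    frontier = [root]
    while frontier:
        next_frontier = []
        # Every vertex of the current level is expanded before the next
        # level starts; this barrier is what "level-synchronized" means.
        for u in frontier:
            for k in range(row_ptr[u], row_ptr[u + 1]):
                v = col_idx[k]
                if parent[v] == -1:
                    parent[v] = u
                    next_frontier.append(v)
        frontier = next_frontier
    return parent


def teps(edges, seconds):
    """Traversed edges per second."""
    return edges / seconds


# Hypothetical 4-vertex undirected graph with edges 0-1, 0-2, 1-3, 2-3.
row_ptr = [0, 2, 4, 6, 8]
col_idx = [1, 2, 0, 3, 0, 3, 1, 2]
parent = bfs_level_synchronized(row_ptr, col_idx, 0)

# Figures taken from the abstract: 2^35 edges traversed in 3.09 seconds.
giga_teps = teps(2 ** 35, 3.09) / 1e9
```

In a distributed version each node would expand only its share of the frontier and exchange the next frontier between levels; that communication step is omitted here.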

Citations: 59

Scalability analysis of enterprise Java workloads on a multi-core system

2011 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2011-11-06 DOI: 10.1109/IISWC.2011.6114202

X. Guerin, Yanbin Liu, Parijat Dube, Seetharami R. Seelam, Pierre-Andre Paumelle

Citations: 0

Ranking commercial machines through data transposition

2011 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2011-11-06 DOI: 10.1109/IISWC.2011.6114192

Beau Piccart, A. Georges, H. Blockeel, L. Eeckhout

Abstract: The performance numbers reported by benchmarking consortia and corporations provide little or no insight into the performance of applications of interest that are not part of the benchmark suite. This paper describes data transposition, a novel methodology for addressing this ubiquitous benchmarking problem. Data transposition predicts the performance of an application of interest on a target machine based on its performance similarities with the industry-standard benchmarks on a limited number of predictive machines. The key idea of data transposition is to exploit machine similarity rather than workload similarity as done in prior work, i.e., data transposition identifies a predictive machine that is most similar to the target machine of interest for predicting performance for the application of interest. We demonstrate the accuracy and effectiveness of data transposition using the SPEC CPU2006 benchmarks and a set of 117 commercial machines. We report that the machine ranking obtained through data transposition correlates well with the machine ranking obtained using measured performance numbers (average correlation coefficient of 0.93). Data transposition not only improves the average correlation but is also more robust towards outlier benchmarks, i.e., the worst-case correlation coefficient improves from 0.59 with prior art to 0.71. More concretely, using data transposition to predict the top-1 machine for an application of interest leads to the best performing machine for most workloads (average deficiency of 1.2% and max deficiency of 24.8% for one benchmark), whereas prior work leads to deficiencies over 100% for some workloads.
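The abstract above evaluates a predicted machine ranking with a correlation coefficient and a top-1 "deficiency". The sketch below shows one way such metrics could be computed. It rests on two assumptions that the abstract does not confirm: the correlation is taken to be Spearman's rank correlation, and deficiency is taken to be the relative gap between the truly best machine and the machine predicted to be best (a definition that can exceed 100%). All names and scores are hypothetical, and scores are treated as higher-is-better with no ties.

```python
def ranks(scores):
    """Rank position of each machine (0 = best), higher score is better.

    Assumes no tied scores.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    rank = [0] * len(scores)
    for position, machine in enumerate(order):
        rank[machine] = position
    return rank


def spearman(predicted, measured):
    """Spearman rank correlation between two score lists (no ties)."""
    n = len(predicted)
    rank_p, rank_m = ranks(predicted), ranks(measured)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_p, rank_m))
    return 1 - 6 * d2 / (n * (n * n - 1))


def top1_deficiency(predicted, measured):
    """Relative loss from buying the machine that was predicted to be best.

    Assumed definition: best measured score divided by the measured score
    of the predicted top-1 machine, minus one.
    """
    chosen = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(measured) / measured[chosen] - 1


# Hypothetical scores for four machines running one application.
predicted = [10.0, 14.0, 12.0, 9.0]
measured = [11.0, 13.0, 13.5, 8.0]

rho = spearman(predicted, measured)
deficiency = top1_deficiency(predicted, measured)
```

In this example the prediction swaps the two best machines, which yields a rank correlation of 0.8 and a top-1 deficiency of about 3.8%.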

Citations: 13

A performance study on operator-based stream processing systems

2011 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2011-11-01 DOI: 10.1109/IISWC.2011.6114204

Miyuru Dayarathna, Souhei Takeno, T. Suzumura

Citations: 8

Characterization of real workloads of web search engines

2011 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2011-11-01 DOI: 10.1109/IISWC.2011.6114193

Huafeng Xi, Jianfeng Zhan, Zhen Jia, Xuehai Hong, Lei Wang, Lixin Zhang, Ninghui Sun, Gang Lu

Abstract: Search is the most heavily used web application in the world and is still growing at an extraordinary rate. Understanding the behavior of web search engines is therefore becoming increasingly important to the design and deployment of data center systems hosting search engines. In this paper, we study three search query traces collected from real-world web search engines at three different search service providers. The first part of our study uncovers the patterns hidden in the query traces by analyzing the variations, frequencies, and locality of query requests. Our analysis reveals that, contrary to some previous studies, real-world query traces do not follow well-defined probability models, such as the Poisson distribution and the log-normal distribution. The second part of our study deploys the real query traces and three synthetic traces, generated using probability models proposed by other researchers, on a Nutch-based search engine. The measured performance data from the deployments further confirm that synthetic traces do not accurately reflect the real traces. We develop an evaluation tool that can collect performance metrics on-line with negligible overhead. The performance metrics include average response time, CPU utilization, disk accesses, and cycles per instruction. The third part of our study compares the search engine with representative benchmarks, namely Gridmix, SPECweb2005, TPC-C, SPEC CPU2006, and HPCC, with respect to basic architecture-level characteristics and performance metrics, such as instruction mix, processor pipeline stall breakdown, memory access latency, and disk accesses. The experimental results show that web search engines have a high percentage of load/store instructions, but have good cache/memory performance. We hope the results presented in this paper will enable system designers to gain insights on optimizing systems hosting search engines.
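The abstract above states that real query traces do not follow a Poisson model. One quick, generic check for that (not the authors' analysis, which the abstract does not detail) is the index of dispersion of per-interval query counts: for Poisson arrivals the variance of the counts equals their mean, so the ratio is close to 1, while bursty traffic gives a ratio well above 1. The function names and the example timestamps below are hypothetical.

```python
from statistics import mean, variance


def counts_per_interval(timestamps, interval):
    """Number of queries that arrive in each fixed-length interval."""
    if not timestamps:
        return []
    start = min(timestamps)
    n_bins = int((max(timestamps) - start) // interval) + 1
    counts = [0] * n_bins
    for t in timestamps:
        counts[int((t - start) // interval)] += 1
    return counts


def index_of_dispersion(counts):
    """Sample variance of the counts divided by their mean.

    Approximately 1 for Poisson arrivals, above 1 for bursty traffic,
    below 1 for traffic that is more regular than Poisson.
    """
    return variance(counts) / mean(counts)


# Hypothetical bursty trace: six queries in the first second, then two
# isolated queries (arrival times in whole seconds).
bursty = [0, 0, 0, 0, 0, 0, 5, 9]
bursty_counts = counts_per_interval(bursty, 1)
bursty_ratio = index_of_dispersion(bursty_counts)

# Hypothetical perfectly regular trace: exactly one query every second.
regular_counts = counts_per_interval(list(range(10)), 1)
regular_ratio = index_of_dispersion(regular_counts)
```

A ratio far from 1 only suggests that a Poisson model is a poor fit; a formal goodness-of-fit test on a long trace would be needed to draw a firm conclusion.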

Citations: 28

Program Interferometry

2011 IEEE International Symposium on Workload Characterization (IISWC) Pub Date : 2011-10-10 DOI: 10.1109/IISWC.2011.6114177

Zhe Wang, Daniel A. Jiménez

Citations: 2