2006 IEEE International Symposium on Workload Characterization: Latest Papers

The Dynamics of Backfilling: Solving the Mystery of Why Increased Inaccuracy May Help
2006 IEEE International Symposium on Workload Characterization Pub Date : 2006-10-01 DOI: 10.1109/IISWC.2006.302737
Dan Tsafrir, D. Feitelson
Abstract: Parallel job scheduling with backfilling requires users to provide runtime estimates, used by the scheduler to better pack the jobs. Studies of the impact of such estimates on performance have modeled them using a "badness factor" f ≥ 0 in an attempt to capture their inaccuracy (given a runtime r, the estimate is uniformly distributed in [r, (f + 1) · r]). Surprisingly, inaccurate estimates (f > 0) yielded better performance than accurate ones (f = 0). We explain this by "heel and toe" dynamics that, with f > 0, cause backfilling to approximate shortest-job-first scheduling. We further find the effect of systematically increasing f is V-shaped: average wait time and slowdown initially drop, only to rise later on. This happens because higher values of f create bigger "holes" in the schedule (longer jobs can backfill) and increase the randomness (more long jobs appear as short), thus overshadowing the initial heel-and-toe preference for shorter jobs. The bottom line is that artificial inaccuracy generated by multiplying (real or perfect) estimates by a factor is (1) just a scheduling technique that trades off fairness for performance, and is (2) ill-suited for studying the effect of real inaccuracy. Real estimates are modal (90% of the jobs use the same 20 estimates) and bounded by a maximum (usually the most popular estimate). Therefore, when performing an evaluation, "increased inaccuracy" should translate to increased modality. Unlike multiplying, this indeed worsens performance as one would intuitively expect.
Citations: 53
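The estimate model described in the backfilling abstract above (an estimate drawn uniformly from [r, (f + 1) · r]) can be illustrated with a minimal sketch. The function name `inflate_estimate`, the sample runtimes, and the seed are assumptions made for illustration; they do not come from the paper.

```python
import random


def inflate_estimate(runtime, f, rng=random):
    """Draw a runtime estimate uniformly from [runtime, (f + 1) * runtime].

    f is the "badness factor" from the abstract; f = 0 gives a perfect estimate.
    """
    if f < 0:
        raise ValueError("badness factor f must be >= 0")
    return rng.uniform(runtime, (f + 1) * runtime)


# Hypothetical job runtimes in seconds, chosen only for illustration.
rng = random.Random(0)
runtimes = [60.0, 600.0, 3600.0]

accurate = [inflate_estimate(r, 0, rng) for r in runtimes]  # f = 0: estimate equals runtime
inflated = [inflate_estimate(r, 2, rng) for r in runtimes]  # f = 2: estimate lies in [r, 3r]
```

With f = 0 the interval collapses to a single point, so the estimate equals the real runtime; larger f only widens the upper end of the interval, which is why this model never produces underestimates.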
A Quantitative Evaluation of the Contribution of Native Code to Java Workloads
2006 IEEE International Symposium on Workload Characterization Pub Date : 2006-10-01 DOI: 10.1109/IISWC.2006.302745
Walter Binder, J. Hulaas, Philippe Moret
Abstract: Many performance analysis tools for Java focus on tracking executed bytecodes, but provide little support in determining the specific contribution of native code libraries. This paper introduces and assesses a portable approach for characterizing the amount of native code executed by Java applications. A profiling agent based on the JVM Tool Interface (JVMTI) accurately keeps track of all runtime transitions between bytecode and native code. It relies on a combination of JVMTI events, Java Native Interface (JNI) function interception, bytecode instrumentation, and hardware performance counters.
Citations: 24
"Warehouse-Sized Workloads"
2006 IEEE International Symposium on Workload Characterization Pub Date : 2006-10-01 DOI: 10.1109/IISWC.2006.302724
L. Barroso
Biography: Dr. Luiz André Barroso is a Distinguished Engineer at Google, where he has worked across several engineering areas, ranging from applications to software infrastructure and hardware design. His projects have included a service to find related academic articles, designing load-balancing server software, fault detection and recovery techniques, RPC-level networking and server performance optimizations, and leading the design of Google's computing platform. Prior to Google he was a member of the Research Staff at Compaq and Digital Equipment Corporations, where his group published extensively on processor and memory system design for database and web server workloads. While at Compaq, he also co-architected and designed Piranha, a scalable shared-memory multiprocessor based on single-chip multiprocessing. Their work on Piranha has had a significant impact in the microprocessor industry, helping inspire many of the multi-core CPUs that are now in the mainstream. Before joining Digital he was one of the designers of the USC RPM, an FPGA-based multiprocessor emulator for rapid hardware prototyping. He has also worked at IBM Brazil's Rio Scientific Center and lectured at PUC-Rio (Brazil) and Stanford University.
Citations: 0
Modeling Cache Sharing on Chip Multiprocessor Architectures
2006 IEEE International Symposium on Workload Characterization Pub Date : 2006-10-01 DOI: 10.1109/IISWC.2006.302740
Pavlos Petoumenos, G. Keramidas, Håkan Zeffer, S. Kaxiras, Erik Hagersten
Abstract: As CMPs are emerging as the dominant architecture for a wide range of platforms (from embedded systems and game consoles, to PCs, and to servers) the need to manage on-chip resources, such as shared caches, becomes a necessity. In this paper we propose a new statistical model of a CMP shared cache which not only describes cache sharing but also its management via a novel fine-grain mechanism. Our model, called StatShare, accurately describes the behavior of the sharing threads using run-time information (reuse-distance information for memory accesses) and helps us understand how effectively each thread uses its space. The mechanism to manage the cache at the cache-line granularity is inspired by cache decay, but contains important differences. Decayed cache-lines are not turned off to save leakage but are rather "available for replacement." Decay modifies the underlying replacement policy (random, LRU) to control sharing but in a very flexible and non-strict way which makes it superior to strict cache partitioning schemes (both fine and coarse grained). The statistical model allows us to assess a thread's cache behavior under decay. Detailed CMP simulations show that: i) StatShare accurately predicts the thread behavior in a shared cache, ii) managing sharing via decay (in combination with the StatShare run time information) can be used to enforce external QoS requirements or various high-level fairness policies.
Citations: 31
Application-Aware Power Management
2006 IEEE International Symposium on Workload Characterization Pub Date : 2006-10-01 DOI: 10.1109/IISWC.2006.302728
K. Rajamani, H. Hanson, J. Rubio, S. Ghiasi, F. Rawson
Abstract: This paper presents our approach for application-aware power management. We combine continuous monitoring of critical workload indicators, online power and performance model usage and timely p-state control to realize application-aware power management. We present two new power management solutions enabled by our methodology: PerformanceMaximizer (PM) finds the best possible performance under specified power constraints and PowerSave (PS) saves energy while keeping performance above specified requirements. We evaluate both using the SPEC-CPU2000 suite on a Pentium M platform, discussing implications of workload characteristics and benefits of being workload-aware.
Citations: 75
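The two policies named in the power-management abstract above can be read as constrained selections over p-states. The sketch below is only an illustration of that reading, not the authors' implementation: the `PState` fields, the fallback behavior when no p-state is feasible, and every number in `MODEL` are hypothetical.

```python
from typing import NamedTuple, Sequence


class PState(NamedTuple):
    freq_mhz: int
    predicted_power_w: float  # output of an assumed online power model
    predicted_perf: float     # output of an assumed performance model, relative to the top p-state


def performance_maximizer(pstates: Sequence[PState], power_limit_w: float) -> PState:
    """Pick the best-performing p-state whose predicted power fits the limit."""
    feasible = [p for p in pstates if p.predicted_power_w <= power_limit_w]
    if not feasible:
        # Assumed fallback: nothing fits, so use the lowest-power p-state.
        return min(pstates, key=lambda p: p.predicted_power_w)
    return max(feasible, key=lambda p: p.predicted_perf)


def power_save(pstates: Sequence[PState], perf_floor: float) -> PState:
    """Pick the lowest-power p-state whose predicted performance meets the floor."""
    feasible = [p for p in pstates if p.predicted_perf >= perf_floor]
    if not feasible:
        # Assumed fallback: nothing meets the floor, so use the fastest p-state.
        return max(pstates, key=lambda p: p.predicted_perf)
    return min(feasible, key=lambda p: p.predicted_power_w)


# Hypothetical model predictions for one monitoring interval.
MODEL = [
    PState(600, 6.0, 0.45),
    PState(1000, 10.0, 0.70),
    PState(1600, 18.0, 0.95),
    PState(2000, 24.5, 1.00),
]

pm_choice = performance_maximizer(MODEL, power_limit_w=20.0)
ps_choice = power_save(MODEL, perf_floor=0.8)
```

In a workload-aware setting the predictions would be refreshed every monitoring interval, so the selected p-state can change as the application moves between phases.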
Predicting Bounds on Queuing Delay in Space-shared Computing Environments
2006 IEEE International Symposium on Workload Characterization Pub Date : 2006-10-01 DOI: 10.1109/IISWC.2006.302746
J. Brevik, Daniel Nurmi, R. Wolski
Abstract: Most space-sharing resources presently operated by high performance computing centers employ some sort of batch queueing system to manage resource allocation to multiple users. In this work, we explore a new method for providing end-users with predictions of the bounds on queuing delay individual jobs will experience when waiting to be scheduled to a machine partition. We evaluate this method using scheduler logs that cover a 10 year period from 10 large HPC systems. Our results show that it is possible to predict delay bounds with specified confidence levels for jobs in different queues, and for jobs requesting different ranges of processor counts.
Citations: 23
Techniques for Real-System Characterization of Java Virtual Machine Energy and Power Behavior
2006 IEEE International Symposium on Workload Characterization Pub Date : 2006-10-01 DOI: 10.1109/IISWC.2006.302727
Gilberto Contreras, M. Martonosi
Abstract: The Java platform has been adopted in a wide variety of systems ranging from portable embedded devices to high-end commercial servers. As energy, power dissipation, and thermal challenges begin to affect all design spaces, Java Virtual Machines will need to evolve in order to respond to these and other emerging issues. Developing a power-conscious Java runtime system begins with a detailed per-component understanding of the energy, performance and power behavior of the system, as well as each component's impact on overall application execution. This paper presents techniques for characterizing Java power and performance, as well as results from applying these techniques to the Jikes RVM, for some of the most salient Java Virtual Machine components. Components studied include the garbage collector, the class loader, and the runtime compilation subsystem. Real-system measurements with our efficient, low-perturbation infrastructure offer valuable insights that can aid virtual machine designers in improving energy-efficiency. For example, our results show that JVM energy consumption can comprise as much as 60% of the total energy consumed. In addition, we find that generational garbage collectors offer the best energy-performance for small heap sizes and that this efficiency is challenged by non-generational collectors for large heaps. Overall, given the rising importance of Java systems and of power/thermal challenges, this paper's detailed real-systems examination can lend useful insights for many real-world systems.
Citations: 6
Performance Analysis of Sequence Alignment Applications
2006 IEEE International Symposium on Workload Characterization Pub Date : 2006-10-01 DOI: 10.1109/IISWC.2006.302729
Friman Sánchez, E. Salamí, Alex Ramírez, M. Valero
Abstract: Advances in molecular biology have led to a continued growth in the biological information generated by the scientific community. Additionally, this area has become a multi-disciplinary field, including components of mathematics, biology, chemistry, and computer science, generating several challenges in the scientific community from different points of view. For this reason, bioinformatic applications represent an increasingly important workload. However, even though the importance of this field is clear, common bioinformatic applications and their implications on micro-architectural design have not received enough attention from the computer architecture community. This paper presents a micro-architecture performance analysis of recognized bioinformatic applications for the comparison and alignment of biological sequences, including BLAST, FASTA and some recognized parallel implementations of the Smith-Waterman algorithm that use the Altivec SIMD extension to speed up execution. We adopt a simulation-based methodology to perform a detailed workload characterization. We analyze architectural and micro-architectural aspects like pipeline configurations, issue widths, functional unit mixes, memory hierarchy and their implications on the performance behavior. We have found that the memory subsystem is the component with the most impact on the performance of the BLAST heuristic, the branch predictor is responsible for the major performance loss for FASTA and SSEARCH34, and long dependency chains are the limiting factor in the SIMD implementations of Smith-Waterman.
Citations: 22
Performance Cloning: A Technique for Disseminating Proprietary Applications as Benchmarks
2006 IEEE International Symposium on Workload Characterization Pub Date : 2006-10-01 DOI: 10.1109/IISWC.2006.302734
A. Joshi, L. Eeckhout, R. Bell, L. John
Abstract: Many embedded real world applications are intellectual property, and vendors hesitate to share these proprietary applications with computer architects and designers. This poses a serious problem for embedded microprocessor designers - how do they customize the design of their microprocessor to provide optimal performance for a class of target customer applications? In this paper, we explore a technique that can automatically extract key performance attributes of a real world application and clone them into a synthetic benchmark. The advantage of the synthetic benchmark clone is that it hides functional meaning of the code but exhibits similar performance characteristics as the target application. Unlike previously proposed workload synthesis techniques, we only model microarchitecture-independent performance attributes into the synthetic clone. By using a set of embedded benchmarks from the MediaBench and MiBench suites, we demonstrate that the performance and power consumption of the synthetic clone correlates well with that of the original application across a wide range of microarchitecture configurations.
Citations: 58
Clustering Application Benchmark
2006 IEEE International Symposium on Workload Characterization Pub Date : 2006-10-01 DOI: 10.1109/IISWC.2006.302742
Oguz Altun, Nilgun Dursunoglu, M. Amasyali
Abstract: An application benchmark based on a set of clustering algorithms is described in this paper. The details of the algorithms (K-means online, K-means batch, SOM-1 dimension, SOM-2 dimension, hierarchical K-means online and hierarchical SOM-1 dimension) are given. The code provided complies with ANSI C specifications and as a result is highly portable. The benchmark has been tested on various platforms using different compilers.
Citations: 8
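For readers unfamiliar with the "K-means online" variant listed in the clustering abstract above, the following is a minimal sketch of the textbook online update (move the nearest centroid a fraction of the way toward each incoming sample). It is not the benchmark's ANSI C code; the function name, the fixed `learning_rate`, and the sample data are assumptions made for illustration.

```python
def online_kmeans(samples, centroids, learning_rate=0.1):
    """Process samples one at a time, nudging the nearest centroid toward each sample."""
    centroids = [list(c) for c in centroids]  # copy so the caller's data is not mutated
    for x in samples:
        # Index of the nearest centroid by squared Euclidean distance.
        nearest = min(
            range(len(centroids)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centroids[k])),
        )
        # Move that centroid a fraction of the way toward the sample.
        centroids[nearest] = [
            c + learning_rate * (a - c) for a, c in zip(x, centroids[nearest])
        ]
    return centroids


# Hypothetical two-dimensional data, for illustration only.
result = online_kmeans(
    samples=[(0.0, 0.0), (10.0, 10.0)],
    centroids=[(1.0, 1.0), (9.0, 9.0)],
    learning_rate=0.5,
)
```

The batch variant differs in that it assigns all samples first and then recomputes each centroid as the mean of its assigned samples.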