IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004: Latest Publications

Cache implications of aggressively pipelined high performance microprocessors

IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004 Pub Date : 2004-03-10 DOI: 10.1109/ISPASS.2004.1291364

T. J. Dysart, Branden J. Moore, Lambert Schaelicke, P. Kogge

Citations: 5

A co-phase matrix to guide simultaneous multithreading simulation

IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004 Pub Date : 2004-03-10 DOI: 10.1109/ISPASS.2004.1291355

Michael Van Biesbrouck, T. Sherwood, B. Calder

Abstract: Several commercial processors have architectures that include support for simultaneous multithreading (SMT), yet there is still not a validated methodology for estimating the performance of an SMT machine that does not rely on full program simulation. To create an efficient sampling approach for SMT we must determine how far to fast-forward each individual thread between samples. The fast-forwarding distance for each thread will vary according to execution phases, thread interactions and changes to the architectural configuration. We examine using individual program phase information to guide SMT simulation. This is accomplished by creating what we call a co-phase matrix. The co-phase matrix is populated by collecting samples of the programs' phase combinations, and is used to guide fast-forwarding between samples. We show for 28 pairs of SPEC programs that using the co-phase matrix provides an average error rate of 4% while requiring that only 1% of the full simulation be performed. The methods are also validated using a variety of architectural configurations and four-threaded workloads.
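The abstract above describes the co-phase matrix only at a high level. As a rough illustration, and not the authors' implementation, the following Python sketch assumes that each thread's phase ID per fixed-size instruction interval is already known, and that a hypothetical `sample_ipc` callback stands in for a short detailed-simulation sample of one phase combination. All names and numbers here are assumptions made for the example.

```python
def estimate_cycles(phases_a, phases_b, interval, sample_ipc):
    """Estimate cycles until the first of two co-running threads finishes.

    phases_a, phases_b: phase ID for each fixed-size instruction interval.
    interval: instructions per interval (assumed equal for both threads).
    sample_ipc(pa, pb) -> (ipc_a, ipc_b): hypothetical stand-in for a
        detailed-simulation sample of the phase combination (pa, pb).
    """
    matrix = {}                       # co-phase matrix: (pa, pb) -> (ipc_a, ipc_b)
    pos_a = pos_b = 0.0               # instructions retired so far per thread
    end_a = len(phases_a) * interval
    end_b = len(phases_b) * interval
    cycles = 0.0
    while pos_a < end_a and pos_b < end_b:
        ia = int(pos_a // interval)
        ib = int(pos_b // interval)
        key = (phases_a[ia], phases_b[ib])
        if key not in matrix:         # sample each combination only once
            matrix[key] = sample_ipc(*key)
        ipc_a, ipc_b = matrix[key]
        # Cycles each thread needs to reach its next interval boundary.
        t_a = ((ia + 1) * interval - pos_a) / ipc_a
        t_b = ((ib + 1) * interval - pos_b) / ipc_b
        step = min(t_a, t_b)
        # Fast-forward both threads; snap the one that hits its boundary.
        pos_a = (ia + 1) * interval if t_a <= t_b else pos_a + step * ipc_a
        pos_b = (ib + 1) * interval if t_b <= t_a else pos_b + step * ipc_b
        cycles += step
    return cycles, matrix


# Made-up per-combination IPC values, purely for illustration.
FAKE_SAMPLES = {(0, 0): (1.0, 1.0), (0, 1): (2.0, 0.5),
                (1, 0): (1.0, 2.0), (1, 1): (0.5, 0.5)}

cycles, matrix = estimate_cycles([0, 0, 1], [0, 1, 1], 100,
                                 lambda pa, pb: FAKE_SAMPLES[(pa, pb)])
```

In this toy setup only the phase combinations that actually occur are sampled, which is the intuition behind needing just a small fraction of the full simulation.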

Citations: 99

Spectral analysis for characterizing program power and performance

IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004 Pub Date : 2004-03-10 DOI: 10.1109/ISPASS.2004.1291367

R. Joseph, M. Martonosi, Zhigang Hu

Abstract: Performance and power analysis in modern processors requires managing a large amount of complex information across many time-scales. For example, thermal control issues are a power subproblem with relevant time constants of millions of cycles or more, while the so-called dI/dT problem is also a power subproblem but occurs because of current variability on a much finer granularity: tens to hundreds of cycles. Likewise, for performance issues, program phase analysis for selecting simulation regions requires looking for periodicity on the order of millions of cycles or more, while some aspects of cache performance optimization require understanding repetitive patterns on much finer granularities. Fourier analysis allows one to transform a waveform into a sum of component (usually sinusoidal) waveforms in the frequency domain; in this way, the waveform's fundamental frequencies (periodicities of repetition) can be clearly identified. This paper shows how one can use Fourier analysis to produce frequency spectra for some of the time waveforms seen in processor execution. By working in the frequency domain, one can easily identify key application tendencies. For example, we demonstrate how to use spectral analysis to characterize the power behavior of real programs. As we show, this is useful for understanding both the temperature profile of a program and its voltage stability. These are particularly relevant issues for architects since thermal concerns and the dI/dT problem have significant influence on processor design. Frequency analysis can also be used to examine program performance. In particular, it can also identify periodic occurrences of important microarchitectural events like cache misses. Overall, the paper demonstrates the value of using frequency analysis in different research efforts related to characterizing and optimizing application performance and power.
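To make the idea of locating a waveform's periodicity in the frequency domain concrete, here is a minimal Python sketch. It is an illustration only, not the paper's tooling: it applies a plain discrete Fourier transform to a synthetic per-cycle power trace (the trace, its amplitudes, and the function names are all assumptions) and reports the period of the strongest non-DC component.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return |X[k]| for k = 0..N/2 of the DFT of the mean-removed samples."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]   # drop the DC component
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(centered):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def dominant_period(samples):
    """Period (in samples) of the strongest non-DC frequency bin."""
    mags = dft_magnitudes(samples)
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return len(samples) / k


# Synthetic trace: a strong 16-cycle oscillation plus a weaker 64-cycle one.
trace = [10.0
         + 3.0 * math.sin(2 * math.pi * t / 16)
         + 1.0 * math.sin(2 * math.pi * t / 64)
         for t in range(256)]

period = dominant_period(trace)   # strongest periodicity, in cycles
```

The direct DFT is quadratic in the trace length; for real traces a fast Fourier transform routine would be the usual choice.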

Citations: 12

StatCache: a probabilistic approach to efficient and accurate data locality analysis

IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004 Pub Date : 2004-03-10 DOI: 10.1109/ISPASS.2004.1291352

Erik Berg, Erik Hagersten

Citations: 193

Efficient architectural design of high performance microprocessors

IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004 Pub Date : 2004-03-10 DOI: 10.1016/S0065-2458(03)61002-8

L. Eeckhout, K. D. Bosschere

Citations: 0

Deconstructing commit

IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004 Pub Date : 2004-03-10 DOI: 10.1109/ISPASS.2004.1291357

Gordon B. Bell, Mikko H. Lipasti

Abstract: Many modern processors execute instructions out of their original program order to exploit instruction-level parallelism and achieve higher performance. However, even though instructions can execute in an arbitrary order, they must eventually commit, or retire from execution, in program order. This constraint provides a safety mechanism to ensure that mis-speculated instructions are not inadvertently committed, but can consume valuable processor resources and severely limit the degree of parallelism exposed in a program. We assert that such a constraint is overly conservative, and propose conditions under which it can be relaxed. This paper deconstructs the notion of commit in an out-of-order processor, and examines the set of necessary conditions under which instructions can be permitted to retire out of program order. It provides a detailed analysis of the frequency and relative importance of these conditions, and discusses microarchitectural modifications that relax the in-order commit requirement. Overall, we found that for a given set of processor resources our technique achieves speedups of up to 68% and 8% for floating point and integer benchmarks, respectively. Conversely, because out-of-order commit allows more efficient utilization of cycle-time limiting resources, it can alternatively enable simpler designs with potentially higher clock frequencies.

Citations: 46

Dynamic pretenuring schemes for generational garbage collection

IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004 Pub Date : 2004-03-10 DOI: 10.1109/ISPASS.2004.1291365

Wei Huang, W. Srisa-an, J. M. Chang

Citations: 18

Communication breakdown: analyzing CPU usage in commercial Web workloads

IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004 Pub Date : 2004-03-10 DOI: 10.1109/ISPASS.2004.1291351

Jaidev P. Patwardhan, A. Lebeck, Daniel J. Sorin

Citations: 17

Eccentric and fragile benchmarks

IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004 Pub Date : 2004-03-10 DOI: 10.1109/ISPASS.2004.1291350

H. Vandierendonck, K. D. Bosschere

Abstract: Benchmarks are essential for computer architecture research and performance evaluation. Constructing a good benchmark suite is, however, non-trivial: it must be representative, show different types of behavior and the benchmarks should not be easily tweaked. This paper uses principal components analysis, a statistical data analysis technique, to detect differences in behavior between benchmarks. Two specific types of benchmarks are identified. Eccentric benchmarks have a behavior that differs significantly from the other benchmarks. They are useful to incorporate different types of behavior in a suite. Fragile benchmarks are weak benchmarks: their execution time is determined almost entirely by a single bottleneck. Removing that bottleneck reduces their execution time excessively. This paper argues that fragile benchmarks are not useful and shows how they can be detected by means of workload characterization techniques. These techniques are applied to the SPEC CPU95 and CPU2000 benchmark suites. It is shown that these suites contain both eccentric and fragile benchmarks. The notions of eccentric and fragile benchmarks are important when composing a benchmark suite and to guide the sub-setting of a benchmark suite.

Citations: 16

Using cache mapping to improve memory performance of handheld devices

IEEE International Symposium on - ISPASS Performance Analysis of Systems and Software, 2004 Pub Date : 2004-03-10 DOI: 10.1109/ISPASS.2004.1291362

Rong-Chang Xu, Zhiyuan Li

Citations: 7