测量程序相似性:用SPEC CPU基准套件进行实验

IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005. Pub Date : 2005-03-20 DOI:10.1109/ISPASS.2005.1430555

Aashish Phansalkar, A. Joshi, L. Eeckhout, L. John

{"title":"测量程序相似性:用SPEC CPU基准套件进行实验","authors":"Aashish Phansalkar, A. Joshi, L. Eeckhout, L. John","doi":"10.1109/ISPASS.2005.1430555","DOIUrl":null,"url":null,"abstract":"It is essential that a subset of benchmark programs used to evaluate an architectural enhancement, is well distributed within the target workload space rather than clustered in specific areas. Past efforts for identifying subsets have primarily relied on using microarchitecture-dependent metrics of program performance, such as cycles per instruction and cache miss-rate. The shortcoming of this technique is that the results could be biased by the idiosyncrasies of the chosen configurations. The objective of this paper is to present a methodology to measure similarity of programs based on their inherent microarchitecture-independent characteristics which will make the results applicable to any microarchitecture. We apply our methodology to the SPEC CPU2000 benchmark suite and demonstrate that a subset of 8 programs can be used to effectively represent the entire suite. We validate the usefulness of this subset by using it to estimate the average IPC and L1 data cache miss-rate of the entire suite. The average IPC of 8-way and 16-way issue superscalar processor configurations could be estimated with 3.9% and 4.4% error respectively. This methodology is applicable not only to find subsets from a benchmark suite, but also to identify programs for a benchmark suite from a list of potential candidates. Studying the four generations of SPEC CPU benchmark suites, we find that other than a dramatic increase in the dynamic instruction count and increasingly poor temporal data locality, the inherent program characteristics have more or less remained the same","PeriodicalId":230669,"journal":{"name":"IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005.","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"172","resultStr":"{\"title\":\"Measuring Program Similarity: Experiments with SPEC CPU Benchmark Suites\",\"authors\":\"Aashish Phansalkar, A. Joshi, L. Eeckhout, L. John\",\"doi\":\"10.1109/ISPASS.2005.1430555\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"It is essential that a subset of benchmark programs used to evaluate an architectural enhancement, is well distributed within the target workload space rather than clustered in specific areas. Past efforts for identifying subsets have primarily relied on using microarchitecture-dependent metrics of program performance, such as cycles per instruction and cache miss-rate. The shortcoming of this technique is that the results could be biased by the idiosyncrasies of the chosen configurations. The objective of this paper is to present a methodology to measure similarity of programs based on their inherent microarchitecture-independent characteristics which will make the results applicable to any microarchitecture. We apply our methodology to the SPEC CPU2000 benchmark suite and demonstrate that a subset of 8 programs can be used to effectively represent the entire suite. We validate the usefulness of this subset by using it to estimate the average IPC and L1 data cache miss-rate of the entire suite. The average IPC of 8-way and 16-way issue superscalar processor configurations could be estimated with 3.9% and 4.4% error respectively. This methodology is applicable not only to find subsets from a benchmark suite, but also to identify programs for a benchmark suite from a list of potential candidates. Studying the four generations of SPEC CPU benchmark suites, we find that other than a dramatic increase in the dynamic instruction count and increasingly poor temporal data locality, the inherent program characteristics have more or less remained the same\",\"PeriodicalId\":230669,\"journal\":{\"name\":\"IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005.\",\"volume\":\"14 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2005-03-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"172\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISPASS.2005.1430555\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPASS.2005.1430555","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 172

摘要

用于评估体系结构增强的基准程序子集必须在目标工作负载空间中很好地分布，而不是聚集在特定区域中。过去识别子集的工作主要依赖于使用与微体系结构相关的程序性能指标，例如每条指令的周期和缓存失误率。这种技术的缺点是结果可能会受到所选配置的特性的影响。本文的目的是提出一种基于程序固有的与微体系结构无关的特征来衡量程序相似性的方法，使结果适用于任何微体系结构。我们将我们的方法应用于SPEC CPU2000基准套件，并证明了8个程序的子集可以用来有效地表示整个套件。我们通过使用它来估计整个套件的平均IPC和L1数据缓存失误率来验证该子集的有效性。8路和16路超标量处理器配置的平均IPC估计误差分别为3.9%和4.4%。这种方法不仅适用于从基准测试套件中找到子集，而且还适用于从潜在候选列表中确定基准测试套件的程序。通过对四代SPEC CPU基准测试套件的研究，我们发现，除了动态指令数量急剧增加和时间数据局部性越来越差之外，程序的固有特征或多或少保持不变

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Measuring Program Similarity: Experiments with SPEC CPU Benchmark Suites

It is essential that a subset of benchmark programs used to evaluate an architectural enhancement, is well distributed within the target workload space rather than clustered in specific areas. Past efforts for identifying subsets have primarily relied on using microarchitecture-dependent metrics of program performance, such as cycles per instruction and cache miss-rate. The shortcoming of this technique is that the results could be biased by the idiosyncrasies of the chosen configurations. The objective of this paper is to present a methodology to measure similarity of programs based on their inherent microarchitecture-independent characteristics which will make the results applicable to any microarchitecture. We apply our methodology to the SPEC CPU2000 benchmark suite and demonstrate that a subset of 8 programs can be used to effectively represent the entire suite. We validate the usefulness of this subset by using it to estimate the average IPC and L1 data cache miss-rate of the entire suite. The average IPC of 8-way and 16-way issue superscalar processor configurations could be estimated with 3.9% and 4.4% error respectively. This methodology is applicable not only to find subsets from a benchmark suite, but also to identify programs for a benchmark suite from a list of potential candidates. Studying the four generations of SPEC CPU benchmark suites, we find that other than a dramatic increase in the dynamic instruction count and increasingly poor temporal data locality, the inherent program characteristics have more or less remained the same

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005.

自引率

0.00%

发文量