Simulation sampling with live-points

2006 IEEE International Symposium on Performance Analysis of Systems and Software
Pub Date: 2006-03-19
DOI: 10.1109/ISPASS.2006.1620785

T. Wenisch, Roland E. Wunderlich, B. Falsafi, J. Hoe


Citations: 58

Abstract

Current simulation-sampling techniques construct accurate model state for each measurement by continuously warming large microarchitectural structures (e.g., caches and the branch predictor) while functionally simulating the billions of instructions between measurements. This approach, called functional warming, is the main performance bottleneck of simulation sampling and requires hours of runtime while the detailed simulation of the sample requires only minutes. Existing simulators can avoid functional simulation by jumping directly to particular instruction stream locations with architectural state checkpoints. To replace functional warming, these checkpoints must additionally provide microarchitectural model state that is accurate and reusable across experiments while meeting tight storage constraints. In this paper, we present a simulation-sampling framework that replaces functional warming with live-points without sacrificing accuracy. A live-point stores the bare minimum of functionally-warmed state for accurate simulation of a limited execution window while placing minimal restrictions on microarchitectural configuration. Live-points can be processed in random rather than program order, allowing simulation results and their statistical confidence to be reported while simulations are in progress. Our framework matches the accuracy of prior simulation-sampling techniques (i.e., ±3% error with 99.7% confidence), while estimating the performance of an 8-way out-of-order superscalar processor running SPEC CPU2000 in 91 seconds per benchmark, on average, using a 12 GB live-point library.
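
The idea of processing live-points in random order and reporting confidence while the run is in progress can be illustrated with a minimal sketch. It is not the paper's framework: `estimate_online`, `measure`, `library`, and the `min_samples` floor of 30 are hypothetical names and choices made for illustration, and the stopping rule simply assumes a normal approximation in which a multiplier of 3 standard errors corresponds to roughly 99.7% confidence.

```python
import math
import random

Z_997 = 3.0  # normal multiplier for roughly 99.7% confidence (assumption)


def estimate_online(measure, library, rel_error=0.03, min_samples=30, seed=0):
    """Visit library entries in random order and stop once the confidence
    interval half-width is within rel_error of the running mean.

    measure: hypothetical callable that returns one measurement (e.g. CPI)
             for one library entry.
    Returns (mean, half_width, samples_used).
    """
    order = list(library)
    random.Random(seed).shuffle(order)  # random order, not program order

    n, mean, m2 = 0, 0.0, 0.0  # running count, mean, sum of squared deviations
    half_width = math.inf
    for entry in order:
        x = measure(entry)
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # Welford's update for the variance
        if n >= min_samples:
            std_err = math.sqrt(m2 / (n - 1) / n)
            half_width = Z_997 * std_err
            if half_width <= rel_error * abs(mean):
                break  # target confidence reached; remaining entries unused
    return mean, half_width, n


if __name__ == "__main__":
    # Synthetic stand-in for per-live-point measurements (not real data).
    rng = random.Random(1)
    fake_library = [rng.gauss(1.5, 0.3) for _ in range(10_000)]
    mean, half_width, used = estimate_online(lambda v: v, fake_library)
    print(f"mean={mean:.3f} +/- {half_width:.3f} after {used} samples")
```

Because the entries are shuffled before they are visited, any prefix of the run is itself a random sample, which is what allows an interim estimate and its confidence interval to be reported before the whole library has been processed.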

