Chuntao Jiang, Zhibin Yu, Hai Jin, Xiaofei Liao, L. Eeckhout, Yonggang Zeng, Chengzhong Xu
Shorter On-Line Warmup for Sampled Simulation of Multi-threaded Applications
2015 44th International Conference on Parallel Processing
Published: 2015-09-01
DOI: 10.1109/ICPP.2015.44 (https://doi.org/10.1109/ICPP.2015.44)
Citations: 1
Abstract
Warmup is crucial in sampled microarchitectural simulation: it avoids performance bias by constructing accurate state for microarchitectural structures before each sampling unit. Time-Based Sampling (TBS) was proposed only recently for the sampled simulation of multi-threaded applications. However, warmup in TBS is challenging because (i) full functional warmup incurs very high overhead, which limits overall simulation speed; (ii) traditional adaptive functional warmup, designed for sampling single-threaded applications, cannot be readily applied to TBS; and (iii) checkpointing is inflexible (even invalid) because of its huge storage requirements and the variation across different runs of multi-threaded applications. In this work, we propose Shorter On-Line (SOL) warmup, which employs a two-stage strategy: 'prime' warmup in the first stage and an extended 'No-State-Loss (NSL)' method in the second stage. SOL is a single-pass, on-line warmup technique that addresses the warmup challenges that TBS poses in parallel simulators. SOL is highly accurate and efficient, providing a good trade-off between simulation accuracy and speed, and is easily deployed to different TBS techniques. For the PARSEC benchmarks on a simulated 8-core system, two state-of-the-art TBS techniques with SOL warmup achieve simulation speedups of 7.2× and 37× over detailed simulation, respectively, compared to 3.1× and 4.5× under full warmup. SOL sacrifices only 0.3% in absolute execution time prediction accuracy on average.
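The reported speedups can be put on a common footing with a little arithmetic. The sketch below is not from the paper: it assumes that both figures for a given TBS technique are measured against the same detailed-simulation baseline, so that dividing one by the other gives the speedup of SOL warmup relative to full warmup. The labels "TBS technique A" and "TBS technique B" are placeholders, since the abstract does not name the two techniques.

```python
# Speedups over detailed simulation, as reported in the abstract
# (PARSEC benchmarks, simulated 8-core system).
# "TBS technique A/B" are placeholder labels; the abstract does not name them.
reported = {
    "TBS technique A": {"sol": 7.2, "full_warmup": 3.1},
    "TBS technique B": {"sol": 37.0, "full_warmup": 4.5},
}


def relative_speedup(sol: float, full_warmup: float) -> float:
    """Speedup of SOL over full warmup.

    Assumes both inputs are speedups over the same detailed-simulation
    baseline, so the baseline time cancels in the ratio.
    """
    return sol / full_warmup


for name, s in reported.items():
    ratio = relative_speedup(s["sol"], s["full_warmup"])
    print(f"{name}: SOL is about {ratio:.1f}x faster than full warmup")
```

Under that assumption, the ratios come out to roughly 2.3× and 8.2×. These are derived values, not numbers stated by the authors.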