N. Wright, Shava Smallen, C. Olschanowsky, J. Hayes, A. Snavely
{"title":"Measuring and Understanding Variation in Benchmark Performance","authors":"N. Wright, Shava Smallen, C. Olschanowsky, J. Hayes, A. Snavely","doi":"10.1109/HPCMP-UGC.2009.72","DOIUrl":null,"url":null,"abstract":"Runtime irreproducibility complicates application performance evaluation on today’s high performance computers. Performance can vary significantly between seemingly identical runs; this presents a challenge to benchmarking as well as a user, who is trying to determine whether the change they made to their code is an actual improvement. In order to gain a better understanding of this phenomenon, we measure the runtime variation of two applications, PARAllel Total Energy Code (PARATEC) and Weather Research and Forecasting (WRF), on three different machines. Key associated metrics are also recorded. The data is then used to 1) quantify the magnitude and distribution of the variations and 2) gain an understanding as why the variations occur. Using our lightweight framework, Integrated Performance Monitoring (IPM), to understand the performance characteristics of individual runs, and the Inca framework to automate the procedure measurements were collected over a month’s time. The results indicate that performance can vary up to 25% and is almost always due to contention for network resources. We also found that the variation differs between machines and is almost always greater on machines with lower performing networks.","PeriodicalId":268639,"journal":{"name":"2009 DoD High Performance Computing Modernization Program Users Group Conference","volume":"408 2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"27","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 DoD High Performance Computing Modernization Program Users Group Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCMP-UGC.2009.72","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 27
Abstract
Runtime irreproducibility complicates application performance evaluation on today’s high performance computers. Performance can vary significantly between seemingly identical runs; this presents a challenge both to benchmarking and to a user trying to determine whether a change made to their code is an actual improvement. To gain a better understanding of this phenomenon, we measure the runtime variation of two applications, PARAllel Total Energy Code (PARATEC) and Weather Research and Forecasting (WRF), on three different machines. Key associated metrics are also recorded. The data is then used to 1) quantify the magnitude and distribution of the variations and 2) gain an understanding of why the variations occur. Using our lightweight framework, Integrated Performance Monitoring (IPM), to understand the performance characteristics of individual runs, and the Inca framework to automate the procedure, measurements were collected over a month’s time. The results indicate that performance can vary by up to 25% and that the variation is almost always due to contention for network resources. We also found that the variation differs between machines and is almost always greater on machines with lower performing networks.
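The abstract quantifies run-to-run variation of up to 25% across repeated runs of the same job. The sketch below is not from the paper; it is a minimal Python illustration of how such spread statistics could be summarized from a set of repeated wall-clock timings. The function name `variation_summary` and the example timings are hypothetical.

```python
# Minimal sketch (not from the paper): summarizing run-to-run variation
# for repeated wall-clock timings of a nominally identical benchmark run.
import statistics


def variation_summary(runtimes_s):
    """Return basic spread statistics for repeated wall-clock times (seconds)."""
    fastest = min(runtimes_s)
    slowest = max(runtimes_s)
    mean = statistics.mean(runtimes_s)
    stdev = statistics.stdev(runtimes_s) if len(runtimes_s) > 1 else 0.0
    return {
        "mean_s": mean,
        "stdev_s": stdev,
        # Spread of the slowest run relative to the fastest, as a percentage;
        # this is the kind of figure the abstract's "up to 25%" refers to.
        "max_variation_pct": 100.0 * (slowest - fastest) / fastest,
        # Coefficient of variation: standard deviation normalized by the mean.
        "cov_pct": 100.0 * stdev / mean,
    }


if __name__ == "__main__":
    # Hypothetical timings (seconds) from repeated runs of the same job.
    example_runtimes = [612.4, 598.1, 655.0, 603.7, 640.2, 611.9]
    print(variation_summary(example_runtimes))
```

In the study itself, per-run performance characteristics came from IPM and the repeated measurements were scheduled automatically with Inca over a month; the snippet above only shows how the resulting timings might be reduced to spread statistics.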