Run-to-run Variability on Xeon Phi based Cray XC Systems

Sudheer Chunduri, K. Harms, Scott Parker, V. Morozov, Samuel Oshin, N. Cherukuri, Kalyan Kumaran
SC17: International Conference for High Performance Computing, Networking, Storage and Analysis. Published 2017-11-12. DOI: 10.1145/3126908.3126926 (https://doi.org/10.1145/3126908.3126926)
Citations: 60

Abstract

The increasing complexity of HPC systems has introduced new sources of variability, which can contribute to significant differences in run-to-run performance of applications. With components at various levels of the system contributing variability, application developers and system users are now faced with the difficult task of running and tuning their applications in an environment where run-to-run performance measurements can vary by as much as a factor of two to three. In this study, we classify, quantify, and present ways to mitigate the sources of run-to-run variability on Cray XC systems with Intel Xeon Phi processors and a dragonfly interconnect. We further demonstrate that the code-tuning performance observed in a variability-mitigating environment correlates with the performance observed in production running conditions.

CCS Concepts: General and reference $\rightarrow$ Performance; Networks $\rightarrow$ Network performance analysis; Hardware $\rightarrow$ Process, voltage and temperature variations.