Sudheer Chunduri, K. Harms, Scott Parker, V. Morozov, Samuel Oshin, N. Cherukuri, Kalyan Kumaran
SC17: International Conference for High Performance Computing, Networking, Storage and Analysis. Published 2017-11-12. DOI: 10.1145/3126908.3126926 (https://doi.org/10.1145/3126908.3126926)
Run-to-run Variability on Xeon Phi based Cray XC Systems
The increasing complexity of HPC systems has introduced new sources of variability, which can contribute to significant differences in run-to-run performance of applications. With components at various levels of the system contributing variability, application developers and system users are now faced with the difficult task of running and tuning their applications in an environment where run-to-run performance measurements can vary by as much as a factor of two to three. In this study, we classify, quantify, and present ways to mitigate the sources of run-to-run variability on Cray XC systems with Intel Xeon Phi processors and a dragonfly interconnect. We further demonstrate that the code-tuning performance observed in a variability-mitigating environment correlates with the performance observed in production running conditions.

CCS CONCEPTS • General and reference $\rightarrow$ Performance; • Networks $\rightarrow$ Network performance analysis; • Hardware $\rightarrow$ Process, voltage and temperature variations