MPI-focused Tracing with OTFX: An MPI-aware In-memory Event Tracing Extension to the Open Trace Format 2
M. Wagner, J. Doleschal, A. Knüpfer
Proceedings of the 22nd European MPI Users' Group Meeting, 2015-09-21
DOI: 10.1145/2802658.2802664 (https://doi.org/10.1145/2802658.2802664)
Citations: 5
Abstract
Performance analysis tools are more indispensable than ever for developing applications that utilize the enormous computing resources of high-performance computing (HPC) systems. In event-based performance analysis, the amount of collected data is one of the most urgent challenges. The measurement bias caused by uncoordinated intermediate memory buffer flushes in the monitoring tool can render a meaningful analysis of the parallel behavior impossible. In this paper, we address the impact of intermediate memory buffer flushes and present a method to avoid file interaction in the monitoring tool entirely. We propose an MPI-focused tracing approach that provides the complete MPI communication behavior and adapts the remaining application events to an amount that fits into a single memory buffer. We demonstrate the capabilities of our method with an MPI-focused prototype implementation of OTFX, based on the Open Trace Format 2 (OTF2), a state-of-the-art open-source event tracing library used by the performance analysis tools Vampir, Scalasca, and Tau. In a comparison with OTF2 based on seven applications from different scientific domains, our prototype introduces on average 5.1% less overhead and reduces the trace size by up to three orders of magnitude.
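
The idea of keeping every MPI event while shrinking the remaining application events until they fit into one buffer can be illustrated with a small sketch. The abstract does not say how the non-MPI events are reduced, so the call-depth policy below is purely an assumption chosen for illustration, and all names (`Event`, `FocusedBuffer`, `record`) are hypothetical and not part of OTFX or OTF2.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: int
    name: str
    depth: int     # call-stack depth at which the event occurred
    is_mpi: bool   # True for MPI communication events


class FocusedBuffer:
    """Fixed-capacity in-memory buffer that never flushes to a file.

    Assumed policy (not taken from the paper): when the buffer is full,
    lower the depth limit for non-MPI events by one level and discard
    stored non-MPI events above the new limit. MPI events are never
    discarded.
    """

    def __init__(self, capacity, max_depth):
        self.capacity = capacity
        self.max_depth = max_depth  # current depth limit for non-MPI events
        self.events = []

    def _reduce(self):
        # Returns False once no non-MPI event can be left to drop.
        if self.max_depth < 0:
            return False
        self.max_depth -= 1
        self.events = [
            e for e in self.events if e.is_mpi or e.depth <= self.max_depth
        ]
        return True

    def record(self, event):
        """Store an event; return False if it was filtered out or did not fit."""
        while True:
            if not event.is_mpi and event.depth > self.max_depth:
                return False  # deeper than the current limit: filtered
            if len(self.events) < self.capacity:
                self.events.append(event)
                return True
            if not self._reduce():
                return False  # buffer holds only MPI events and is full


if __name__ == "__main__":
    buf = FocusedBuffer(capacity=4, max_depth=2)
    for ev in [
        Event(0, "main", 0, False),
        Event(1, "solve", 1, False),
        Event(2, "kernel", 2, False),
        Event(3, "MPI_Send", 3, True),
        Event(4, "MPI_Recv", 3, True),
    ]:
        buf.record(ev)
    print([e.name for e in buf.events], "depth limit:", buf.max_depth)
```

In this sketch the trade-off is explicit: detail about deeply nested application functions is given up first, while the MPI communication record stays complete until the buffer holds nothing but MPI events.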