An Active Learning Method for Empirical Modeling in Performance Tuning

Jiepeng Zhang, Jingwei Sun, Wenju Zhou, Guangzhong Sun
DOI: 10.1109/IPDPS47924.2020.00034
Published in: 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 244-253, May 2020
Citations: 5

Abstract

Tuning the performance of scientific applications is challenging because performance can be a complicated nonlinear function of application parameters. Empirical performance modeling is a useful approach to approximate this function and enable efficient heuristic methods to find sub-optimal parameter configurations. However, empirical performance modeling requires a large number of samples from the parameter space, which is resource- and time-consuming. To address this issue, existing work based on active learning techniques proposed the PBU sampling method, which considers performance before uncertainty: it iteratively performs performance-biased sampling to model the high-performance subspace instead of the entire space, and then evaluates the most uncertain samples to reduce redundancy. Compared with uniformly random sampling, this approach reduces the number of samples, but it still involves redundant sampling that can potentially be improved. We propose a novel active-learning-based method to exploit the information of evaluated samples and explore possible high-performance parameter configurations. Specifically, we adopt a Performance Weighted Uncertainty (PWU) sampling strategy to identify configurations with either high performance or high uncertainty and determine which ones are selected for evaluation. To evaluate the effectiveness of the proposed method, we construct random forest models to predict the execution time of kernels from the SPAPT suite and of two typical scientific parallel applications, Kripke and Hypre. Experimental results show that, compared with existing methods, the proposed method reduces the cost of modeling by up to 21x and by 3x on average while maintaining the same prediction accuracy.
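To make the PWU idea concrete, the sketch below shows one plausible acquisition step: an ensemble of predictors (standing in for the paper's random forest trees) estimates each candidate configuration's execution time, the disagreement across the ensemble serves as uncertainty, and the score weights that uncertainty by predicted performance (lower predicted time, higher weight). The `tree_predict` function, the candidate set, and the specific weighting form `uncertainty / mean_time` are illustrative assumptions, not the paper's exact formulation.

```python
import random
import statistics

def tree_predict(x, seed):
    """Hypothetical per-tree estimate of execution time for configuration x.

    A noisy quadratic stands in for one tree of a trained random forest;
    the disagreement between trees comes from the per-seed noise.
    """
    rng = random.Random(seed * 1000 + x)
    return (x - 3) ** 2 + 1 + rng.uniform(-0.5, 0.5)

def pwu_score(x, n_trees=20):
    """Performance Weighted Uncertainty score (assumed form).

    Uncertainty is the standard deviation of per-tree predictions;
    it is weighted by 1 / mean predicted time so that configurations
    that look fast AND uncertain score highest.
    """
    preds = [tree_predict(x, s) for s in range(n_trees)]
    mean_time = statistics.mean(preds)      # predicted execution time
    uncertainty = statistics.stdev(preds)   # disagreement across trees
    return uncertainty / mean_time

def select_next(candidates):
    """Pick the unevaluated configuration with the highest PWU score."""
    return max(candidates, key=pwu_score)

# One acquisition step over a toy parameter space.
configs = [0, 1, 2, 3, 4, 5, 6]
chosen = select_next(configs)
```

In a full active-learning loop, the chosen configuration would be evaluated on the real application, added to the training set, and the forest retrained before the next selection.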