Applying Supervised Learning to the Static Prediction of Locality-Pattern Complexity in Scientific Code

2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). Pub Date: 2018-12-01. DOI: 10.1109/ICMLA.2018.00162

Nasser Alsaedi, S. Carr, A. Fong

Pages: 995-1000

Citations: 2

Abstract

On modern computer systems, the performance of an application depends largely on its locality. Current static locality analysis in compilers has limited applicability because little run-time information is available at compile time. By instrumenting and running programs, training-based locality analysis can accurately predict the locality of an application from the size of its input data; however, it is costly in terms of time and space. In this paper, we combine source-code analysis with training-based locality analysis to construct a supervised-learning model parameterized only by source-code properties. This model is the first to predict the upper bound of data reuse change (locality-pattern complexity) at compile time for loop nests in array-based programs without the need to instrument and run the program. The result is the ability to efficiently predict how virtual memory usage grows as a function of the input size. We have evaluated our model using array-based code as input to a variety of classification algorithms, including Naive Bayes, decision tree, and Support Vector Machine (SVM). Our experiments show that SVM outperforms the other classifiers, with 97% precision, a 97% true positive rate, and a 1% false positive rate. We are able to accurately predict the growth rate of memory usage in unseen scientific code without the need to instrument and run the program. This work represents a significant step toward an accurate static memory usage predictor for use in Virtual Machines (VMs) in cloud data centers.
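The abstract reports precision, true positive rate, and false positive rate for each classifier. As a reminder of how those three figures relate to a classifier's predictions, here is a minimal sketch that computes them for one class treated as "positive" against all others. The class labels ("constant", "linear", "quadratic"), the sample predictions, and the function name `one_vs_rest_rates` are hypothetical illustrations, not the paper's actual classes, data, or evaluation code, and the abstract does not say how per-class figures were averaged.

```python
from collections import Counter


def one_vs_rest_rates(y_true, y_pred, positive):
    """Return (precision, true positive rate, false positive rate) for one class."""
    # Count each (is actually positive, is predicted positive) combination.
    counts = Counter(
        (truth == positive, pred == positive)
        for truth, pred in zip(y_true, y_pred)
    )
    tp = counts[(True, True)]    # positive, predicted positive
    fn = counts[(True, False)]   # positive, predicted as another class
    fp = counts[(False, True)]   # other class, predicted positive
    tn = counts[(False, False)]  # other class, predicted as other class

    precision = tp / (tp + fp) if tp + fp else 0.0
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return precision, tpr, fpr


# Hypothetical complexity labels for six loop nests (not data from the paper).
y_true = ["constant", "linear", "linear", "quadratic", "linear", "constant"]
y_pred = ["constant", "linear", "quadratic", "quadratic", "linear", "constant"]

for label in ("constant", "linear", "quadratic"):
    precision, tpr, fpr = one_vs_rest_rates(y_true, y_pred, positive=label)
    print(f"{label}: precision={precision:.2f} TPR={tpr:.2f} FPR={fpr:.2f}")
```

A high true positive rate paired with a low false positive rate, as reported for SVM, means that a class is rarely missed and rarely assigned to loop nests that belong to a different class.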
