Applying Supervised Learning to the Static Prediction of Locality-Pattern Complexity in Scientific Code
Authors: Nasser Alsaedi, S. Carr, A. Fong
Published in: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 995-1000, 2018-12-01
DOI: 10.1109/ICMLA.2018.00162 (https://doi.org/10.1109/ICMLA.2018.00162)
Citations: 2
Abstract
On modern computer systems, the performance of an application depends largely on its locality. Static locality analysis in current compilers has limited applicability because little run-time information is available at compile time. Training-based locality analysis, which instruments and runs programs, can accurately predict the locality of an application from the size of its input data; however, it is costly in both time and space. In this paper, we combine source-code analysis with training-based locality analysis to construct a supervised-learning model parameterized only by source-code properties. This model is the first that can predict the upper bound of data reuse change (locality-pattern complexity) at compile time for loop nests in array-based programs without instrumenting and running the program. The result is an efficient way to predict how virtual memory usage grows as a function of the input size. We evaluated our model using array-based code as input to a variety of classification algorithms, including Naive Bayes, decision tree, and Support Vector Machine (SVM). Our experiments show that SVM outperforms the other classifiers, with 97% precision, a 97% true positive rate, and a 1% false positive rate. We can accurately predict the growth rate of memory usage in unseen scientific code without instrumenting and running the program. This work represents a significant step toward an accurate static memory usage predictor for use in Virtual Machines (VMs) in cloud data centers.
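The three figures quoted for the classifiers (precision, true positive rate, false positive rate) can be derived from predicted and actual class labels. The sketch below is only an illustration of how such per-class metrics are commonly computed; it is not the authors' evaluation code. The function name `binary_metrics`, the class labels, and the toy data are hypothetical.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive):
    """Return (precision, true positive rate, false positive rate)
    for one class, treating every other class as negative."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            counts["tp"] += 1  # correctly predicted positive
        elif predicted == positive:
            counts["fp"] += 1  # predicted positive, actually negative
        elif actual == positive:
            counts["fn"] += 1  # missed positive
        else:
            counts["tn"] += 1  # correctly predicted negative
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    # Guard against empty denominators on tiny samples.
    precision = tp / (tp + fp) if tp + fp else 0.0
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return precision, tpr, fpr


# Hypothetical complexity labels for six loop nests (made-up data).
y_true = ["linear", "linear", "constant", "quadratic", "linear", "constant"]
y_pred = ["linear", "constant", "constant", "quadratic", "linear", "linear"]

precision, tpr, fpr = binary_metrics(y_true, y_pred, positive="linear")
print(f"precision={precision:.2f} TPR={tpr:.2f} FPR={fpr:.2f}")
```

For a multi-class problem, one common convention is to compute these values per class and then average them; the abstract does not say which averaging scheme was used.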