Tarek Ramadan, Nathan Pinnow, Chase Phelps, Jayaraman J. Thiagarajan, Tanzima Z. Islam
{"title":"Structure-Aware Representation Learning for Effective Performance Prediction","authors":"Tarek Ramadan, Nathan Pinnow, Chase Phelps, Jayaraman J. Thiagarajan, Tanzima Z. Islam","doi":"10.1002/cpe.70046","DOIUrl":null,"url":null,"abstract":"<p>Application performance is a function of several unknowns stemming from the interactions between the application, runtime, OS, and underlying hardware, making it challenging to model performance using deep learning techniques, especially without a large labeled dataset. Collecting such labeled longitudinal datasets can take weeks. Intuitively, developers could save analysis time during code development by taking a comparative approach between multiple applications. However, the unknown dynamic interactions between applications and execution environments make it difficult for deep learning-based models to predict the performance of new applications. In this paper, we address these problems by presenting a labeled dataset for the community and taking a comparative analysis approach to explore the source code differences between different correct implementations of the same problem. This paper assesses the feasibility of using purely static information, for example, Abstract Syntax Tree (AST), of applications to predict performance change based on code structure. We evaluate several deep learning-based representation learning techniques for source code and propose an architecture for the tree-based Long Short-Term Memory (LSTM) models to discover latent representations for a source code's hierarchical structure. 
We demonstrate that our proposed architecture enables feed-forward predictive models to predict change in performance using source code with up to 84% accuracy.</p>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 9-11","pages":""},"PeriodicalIF":1.5000,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cpe.70046","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Concurrency and Computation-Practice & Experience","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/cpe.70046","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Citations: 0
Abstract
Application performance is a function of several unknowns stemming from the interactions between the application, runtime, OS, and underlying hardware, making it challenging to model performance using deep learning techniques, especially without a large labeled dataset. Collecting such labeled longitudinal datasets can take weeks. Intuitively, developers could save analysis time during code development by taking a comparative approach between multiple applications. However, the unknown dynamic interactions between applications and execution environments make it difficult for deep learning-based models to predict the performance of new applications. In this paper, we address these problems by presenting a labeled dataset for the community and taking a comparative analysis approach to explore the source code differences between different correct implementations of the same problem. This paper assesses the feasibility of using purely static information about applications, such as the Abstract Syntax Tree (AST), to predict performance change based on code structure. We evaluate several deep learning-based representation learning techniques for source code and propose an architecture for tree-based Long Short-Term Memory (LSTM) models to discover latent representations for a source code's hierarchical structure. We demonstrate that our proposed architecture enables feed-forward predictive models to predict change in performance from source code with up to 84% accuracy.
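To make the abstract's core idea concrete, here is a minimal sketch (not the authors' pipeline) of the kind of purely static AST information the paper builds on: two correct implementations of the same problem can differ structurally even though they compute the same result, and those structural differences are visible in their ASTs without ever running the code. The helper function `ast_node_types` and both example implementations are illustrative assumptions introduced here, using only Python's standard `ast` module.

```python
import ast

def ast_node_types(source):
    """Parse source code and count AST node types -- a purely static feature
    vector derived from code structure alone, with no execution required."""
    tree = ast.parse(source)
    counts = {}
    for node in ast.walk(tree):
        name = type(node).__name__
        counts[name] = counts.get(name, 0) + 1
    return counts

# Two correct implementations of the same problem (sum of squares below n):
impl_loop = (
    "def f(n):\n"
    "    s = 0\n"
    "    for i in range(n):\n"
    "        s += i * i\n"
    "    return s\n"
)
impl_comp = (
    "def f(n):\n"
    "    return sum(i * i for i in range(n))\n"
)

a = ast_node_types(impl_loop)
b = ast_node_types(impl_comp)

# The loop version contains a For node; the generator version replaces it
# with a GeneratorExp -- a structural difference a model can learn from.
```

A learned model such as the tree-LSTM described in the abstract would consume the full hierarchical tree rather than flat node counts, but the same principle applies: the input is static structure, not runtime measurements.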
期刊介绍:
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality original research papers and authoritative research review papers in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.