Title: Parallel sorting algorithm classification: is manual instrumentation necessary?
Authors: Michael McKinsey, Dewi Yokelson, Stephanie Brink, Tom Scogland, Olga Pearce
Journal: Future Generation Computer Systems: The International Journal of eScience, Volume 176, Article 108170
DOI: 10.1016/j.future.2025.108170
Published: 2025-09-27
URL: https://www.sciencedirect.com/science/article/pii/S0167739X25004649
Citations: 0
Abstract
Understanding parallel algorithms is crucial for accelerating scientific simulations on complex, distributed-memory, high-performance computers. Modern algorithm classification approaches learn semantics directly from source code to differentiate between algorithms; however, access to source code is not always possible. We can learn about parallel algorithms by observing their performance, as programs running the same algorithms on the same hardware should exhibit similar performance characteristics. We present an approach that learns algorithm classes directly from parallel performance data in order to classify algorithms without access to the source code. We extend previous work to enable classifying parallel sorting algorithms using automatic instrumentation instead of requiring manual region annotations in the source code. In this work, we design and demonstrate a study for classifying parallel sorting algorithms using parallel performance data collected via automatic instrumentation, and we evaluate the classification performance of our new methodology. We leverage Caliper to collect the performance data, Thicket for exploratory data analysis (EDA), and PyTorch and Scikit-learn to evaluate the effectiveness of random forests, support vector machines (SVMs), decision trees, neural networks, and logistic regression on parallel performance data. Additionally, we study noise in parallel performance data, examine whether removing noise and pre-processing the data are necessary to accurately classify parallel sorting algorithms, and determine the effectiveness of features created from performance data. Across these five models, we demonstrate classification accuracy of up to 97.7% over four parallel algorithm classes.
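The abstract describes evaluating several Scikit-learn classifiers on features derived from Caliper performance data. The paper's own code and dataset are not reproduced here; the following is a minimal sketch of what the Scikit-learn side of such an evaluation could look like, assuming a placeholder feature matrix in place of real Caliper/Thicket-derived features and hypothetical labels for the four algorithm classes. The neural network (trained with PyTorch in the paper) is omitted for brevity.

```python
# Hypothetical sketch: classifying parallel sorting algorithms from
# performance features with scikit-learn. The feature matrix and labels
# below are random placeholders, NOT the paper's actual dataset, so the
# printed accuracies will hover around chance.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Placeholder features (in the paper these would come from automatic
# instrumentation, e.g., per-region timings and derived metrics).
X = rng.normal(size=(400, 16))
y = rng.integers(0, 4, size=400)  # four parallel sorting algorithm classes

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

models = {
    "random forest": RandomForestClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {acc:.3f}")
```

Scaling features before the SVM and logistic regression (via the pipeline) is standard practice for distance- and margin-based models; whether the authors pre-process similarly is one of the questions the paper itself investigates.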
Journal Introduction
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.