AS2: Adaptive sorting algorithm selection for heterogeneous workloads and systems

IF 6.2 | Zone 2, Computer Science | JCR Q1, Computer Science, Theory & Methods
Sangmyung Lee, Byungyoon Lee, Yongseok Son, Kiwook Sohn, Hwajung Kim, Sunggon Kim
Journal: Future Generation Computer Systems-The International Journal of Escience, vol. 172, Article 107860
Published: 2025-04-30
DOI: 10.1016/j.future.2025.107860
URL: https://www.sciencedirect.com/science/article/pii/S0167739X25001554

Abstract

Sorting is becoming increasingly important in modern computing, on platforms ranging from small-scale Internet of Things (IoT) devices to supercomputers. To improve sorting performance, different systems adopt various algorithms, including Intro sort, Merge sort, Heap sort, and Insertion sort. However, the performance of sorting algorithms depends on various factors, and our analysis shows that the optimal algorithm varies, with no single algorithm consistently outperforming the others. In this paper, we first analyze the data-internal factors (data size, distribution, data type) and external factors (thread count, hardware) that affect sorting algorithm performance. We use widely adopted sorting algorithms such as STL sort and Merge sort, as well as state-of-the-art sorting algorithms such as Ips4o sort and Aips2o sort. In addition to sequential sorting algorithms, we implement Parallel Intro sort and use the parallel versions of the state-of-the-art sorting algorithms with varying numbers of threads. Based on this analysis, we present an adaptive sorting algorithm selection model for heterogeneous workloads and systems, called AS2 (Adaptive Sorting Algorithm Selection). Its goal is to determine the optimal algorithm among existing sorting algorithms for heterogeneous workloads and systems. AS2 uses various ML models to build a performance model for each sorting algorithm from the data-internal and external factors of various datasets. AS2 then chooses the optimal sorting algorithm based on the performance predicted by these models. We evaluate AS2 using a representative dataset that covers various data-internal and external factors. The results show that AS2 can accurately predict the performance of various sorting algorithms, with minimum and maximum R-squared values of 0.83 and 0.99, respectively. In addition, AS2 selects the optimal algorithm in our evaluation scenario with up to 99.68% accuracy by choosing the algorithm with the shortest predicted sorting time, improving performance by up to 1.83× compared to the state-of-the-art algorithm. We also evaluate AS2 on a real-world dataset, where it selects the optimal algorithm with 87.50% accuracy.
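
The selection step described in the abstract (predict each algorithm's sorting time, then pick the shortest) might look roughly like the sketch below. This is an illustration only, not the authors' implementation. The feature vector (log2 of the data size and the thread count), the least-squares linear model, the algorithm names `sequential_sort` and `parallel_sort`, and all timing numbers are assumptions invented for the example; the paper reports using various ML models trained on measured data.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y ~ X @ w + b; returns the weights with the bias last."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(w, x):
    """Predicted sorting time for one feature vector x."""
    return float(np.append(x, 1.0) @ w)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def select_algorithm(models, x):
    """Return the name of the algorithm with the shortest predicted time."""
    return min(models, key=lambda name: predict(models[name], x))

# Hypothetical training set: features are [log2(n), threads], targets are times in ms.
X = np.array([[10, 1], [20, 1], [10, 8], [20, 8]], dtype=float)
measured = {
    "sequential_sort": np.array([1.0, 50.0, 1.0, 50.0]),  # made-up numbers
    "parallel_sort": np.array([5.0, 30.0, 3.0, 8.0]),     # made-up numbers
}

# One performance model per algorithm, as in the abstract's description.
models = {name: fit_linear(X, y) for name, y in measured.items()}

# Model quality per algorithm, then a selection for a small single-threaded input.
for name, y in measured.items():
    preds = np.array([predict(models[name], x) for x in X])
    print(name, "R^2 =", round(r_squared(y, preds), 3))
print(select_algorithm(models, np.array([10.0, 1.0])))
```

Keeping one model per algorithm, instead of a single classifier over algorithms, means a new sorting algorithm can be added by training one more model without retraining the others; whether AS2 is structured for that reason is not stated in the abstract.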
Source journal
CiteScore: 19.90
Self-citation rate: 2.70%
Articles published: 376
Review time: 10.6 months
Journal description: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.