A multilevel algorithm for scalable independent task assignment

Impact factor 6.2; Zone 2, Computer Science; JCR Q1, Computer Science, Theory & Methods

Future Generation Computer Systems-The International Journal of Escience Pub Date : 2025-10-08 DOI:10.1016/j.future.2025.108183

H. Burhan Tabak, E. Kartal Tabak, Cevdet Aykanat



Abstract

Assigning a large number of independent tasks to heterogeneous processors is a fundamental problem in modern computing, with applications in many domains such as cloud services, web crawling, and AI training. Exact and matheuristic approaches deliver high-quality assignments but incur superlinear or even exponential runtime costs, making them impractical, especially on large problem instances. Conversely, lightweight heuristics run efficiently at scale but often produce assignments of much lower quality. To address this issue, we present the first multilevel framework for the independent task assignment problem that maintains an end-to-end linear runtime bound of $O(KN)$, where $K \times N$ is the size of the expected-time-to-compute matrix, with $K$ and $N$ representing the number of processors and tasks, respectively. We propose (i) novel high-quality coarsening metrics that numerically define task characteristics and similarity; (ii) an efficient and effective matching algorithm that incorporates these metrics while maintaining linear time complexity with respect to the input size; (iii) an initial solution scheme that generates base solutions using complementary heuristics, which are disjointly projected back through the uncoarsening levels; (iv) an effective and efficient uncoarsening algorithm that iteratively improves assignment quality with different refinement algorithms. Extensive experimental evaluations involving hundreds of millions of tasks demonstrate that our algorithm achieves significantly higher quality and runs faster than known high-quality heuristics, making it a practical choice for large-scale problem instances.
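The coarsen / initial-solution / uncoarsen-and-refine pipeline described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the coarsening metrics, the matching rule, the base heuristics, or the refinement algorithms, so everything below is a simple stand-in. The sketch assumes the objective is to minimize the makespan (the largest total processor load), pairs tasks that share the same fastest processor, uses a greedy minimum-completion-time heuristic at the coarsest level, and refines with single-task moves. All function names are hypothetical.

```python
import random

def random_etc(K, N, seed=0):
    """Hypothetical helper: K x N expected-time-to-compute matrix of small integers."""
    rng = random.Random(seed)
    return [[rng.randint(1, 100) for _ in range(N)] for _ in range(K)]

def makespan(etc, assign):
    """Largest total load over all processors (assumed objective)."""
    loads = [0.0] * len(etc)
    for t, p in enumerate(assign):
        loads[p] += etc[p][t]
    return max(loads)

def coarsen(etc):
    """Stand-in matching: pair up tasks that share the same fastest processor.
    A coarse task's ETC column is the sum of its members' columns. O(KN)."""
    K, N = len(etc), len(etc[0])
    buckets = [[] for _ in range(K)]
    for t in range(N):
        best = min(range(K), key=lambda p: etc[p][t])
        buckets[best].append(t)
    groups = [b[i:i + 2] for b in buckets for i in range(0, len(b), 2)]
    coarse = [[sum(etc[p][t] for t in g) for g in groups] for p in range(K)]
    return coarse, groups

def greedy(etc):
    """Stand-in base heuristic: put each task where it would finish earliest. O(KN)."""
    K, N = len(etc), len(etc[0])
    loads, assign = [0.0] * K, []
    for t in range(N):
        p = min(range(K), key=lambda q: loads[q] + etc[q][t])
        loads[p] += etc[p][t]
        assign.append(p)
    return assign

def refine(etc, assign):
    """Stand-in refinement: one pass of single-task moves. A task moves only if
    its new processor ends up strictly below the old processor's current load,
    so the makespan never increases. O(KN) per pass."""
    K = len(etc)
    loads = [0.0] * K
    for t, p in enumerate(assign):
        loads[p] += etc[p][t]
    for t, p in enumerate(assign):
        q = min(range(K), key=lambda r: loads[r] + (etc[r][t] if r != p else 0.0))
        if q != p and loads[q] + etc[q][t] < loads[p]:
            loads[p] -= etc[p][t]
            loads[q] += etc[q][t]
            assign[t] = q
    return assign

def multilevel_assign(etc, min_tasks=8):
    """Coarsen until few tasks remain, solve, then project back level by level."""
    levels, cur = [], etc
    while len(cur[0]) > min_tasks:
        coarse, groups = coarsen(cur)
        if len(groups) == len(cur[0]):  # no pair was formed; stop coarsening
            break
        levels.append((cur, groups))
        cur = coarse
    assign = refine(cur, greedy(cur))
    for fine_etc, groups in reversed(levels):
        fine = [0] * len(fine_etc[0])
        for g, p in zip(groups, assign):
            for t in g:  # every member inherits its coarse task's processor
                fine[t] = p
        assign = refine(fine_etc, fine)
    return assign
```

For example, `multilevel_assign(random_etc(4, 200))` returns a list of 200 processor indices. Each level in this sketch costs time proportional to the size of that level's matrix, and when most tasks find a partner the levels shrink roughly geometrically; whether and how the paper's own matching guarantees the stated $O(KN)$ bound is not described in the abstract.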


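Source journal: Future Generation Computer Systems-The International Journal of Escience (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore: 19.90
Self-citation rate: 2.70%
Articles published: 376
Review time: 10.6 months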

About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.