Workstation capacity tuning using reinforcement learning

Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07). Pub Date: 2007-11-16. DOI: 10.1145/1362622.1362666

Aharon Bar-Hillel, Amir Di-Nur, L. Ein-Dor, Ran Gilad-Bachrach, Yossi Ittach

Citations: 13

Abstract

Computer grids are complex, heterogeneous, and dynamic systems whose behavior is governed by hundreds of manually tuned parameters. As the complexity of these systems grows, automating parameter tuning becomes indispensable. In this paper, we consider the problem of auto-tuning server capacity, i.e., the number of jobs a server runs in parallel. We present three different reinforcement learning algorithms, which generate a dynamic policy by changing the number of concurrently running jobs according to the job types and machine state. The algorithms outperform manually tuned policies over the entire range of workloads checked, with an average throughput improvement greater than 20%. On multi-core servers, the average throughput improvement is approximately 40%, which hints at the large improvement potential of such a tuning mechanism as systems gradually transition to multi-core machines.
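The abstract does not say which three reinforcement learning algorithms are used, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It uses an epsilon-greedy contextual bandit, one of the simplest reinforcement learning setups, to pick a capacity (number of concurrent jobs) per job type. Everything in it is an assumption made for illustration: the two job types, the capacity range, the 4-core machine, and the `toy_throughput` reward model are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict

# Hypothetical action space and job types (assumptions, not from the paper).
CAPACITIES = [1, 2, 3, 4, 5, 6, 7, 8]
JOB_TYPES = ["cpu_bound", "io_bound"]
CORES = 4  # assumed core count of the simulated server


def toy_throughput(job_type, capacity, rng):
    """Made-up throughput model standing in for a real server measurement."""
    if job_type == "cpu_bound":
        # Scales with capacity up to the core count, then contention hurts.
        base = min(capacity, CORES) - 0.3 * max(0, capacity - CORES)
    else:
        # I/O-bound jobs overlap their waits, so saturation comes later.
        limit = 2 * CORES - 1
        base = 0.6 * min(capacity, limit) - 0.5 * max(0, capacity - limit)
    return base + rng.gauss(0, 0.05)  # small measurement noise


def train(steps=4000, epsilon=0.2, seed=0):
    """Learn a capacity per job type with an epsilon-greedy bandit."""
    rng = random.Random(seed)
    q = defaultdict(float)  # running mean reward per (job_type, capacity)
    n = defaultdict(int)    # number of times each pair was tried
    for _ in range(steps):
        job_type = rng.choice(JOB_TYPES)
        if rng.random() < epsilon:
            capacity = rng.choice(CAPACITIES)  # explore
        else:
            capacity = max(CAPACITIES, key=lambda c: q[(job_type, c)])  # exploit
        reward = toy_throughput(job_type, capacity, rng)
        key = (job_type, capacity)
        n[key] += 1
        q[key] += (reward - q[key]) / n[key]  # incremental sample average
    # Greedy policy: best estimated capacity for each job type.
    return {jt: max(CAPACITIES, key=lambda c: q[(jt, c)]) for jt in JOB_TYPES}


if __name__ == "__main__":
    print(train())
```

Under this toy model the learned policy assigns a lower capacity to CPU-bound jobs than to I/O-bound jobs, which is the kind of job-type-dependent behavior a single static capacity setting cannot express. A realistic version would also condition on machine state, as the abstract describes.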


