Disengaged scheduling for fair, protected access to fast computational accelerators

Proceedings of the 19th international conference on Architectural support for programming languages and operating systems. Pub Date: 2014-02-24. DOI: 10.1145/2541940.2541963

Konstantinos Menychtas, Kai Shen, M. Scott


Citations: 63

Abstract

Today's operating systems treat GPUs and other computational accelerators as if they were simple devices, with bounded and predictable response times. With accelerators assuming an increasing share of the workload on modern machines, this strategy is already problematic, and likely to become untenable soon. If the operating system is to enforce fair sharing of the machine, it must assume responsibility for accelerator scheduling and resource management. Fair, safe scheduling is a particular challenge on fast accelerators, which allow applications to avoid kernel-crossing overhead by interacting directly with the device. We propose a disengaged scheduling strategy in which the kernel intercedes between applications and the accelerator on an infrequent basis, to monitor their use of accelerator cycles and to determine which applications should be granted access over the next time interval. Our strategy assumes a well-defined, narrow interface exported by the accelerator. We build upon such an interface, systematically inferred for the latest Nvidia GPUs. We construct several example schedulers, including Disengaged Timeslice with overuse control, which guarantees fairness, and Disengaged Fair Queueing, which is effective in limiting resource idleness but offers only probabilistic guarantees. Both schedulers ensure fair sharing of the GPU, even among uncooperative or adversarial applications; Disengaged Fair Queueing incurs a 4% overhead on average (max 18%) compared to direct device access across our evaluation scenarios.
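The idea of a timeslice scheduler with overuse control can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how overuse is measured or repaid, so the model below simply assumes that requests are non-preemptible, that the kernel acts only at turn boundaries, and that any overrun is charged against the task's future slices. The names `Task`, `run_rounds`, `slice_len`, and `debt` are hypothetical, and time is counted in arbitrary integer ticks.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    request_len: int  # ticks one non-preemptible request occupies the device
    used: int = 0     # device ticks consumed so far
    debt: int = 0     # overuse not yet paid back (assumed accounting scheme)


def run_rounds(tasks, slice_len, rounds):
    """Give each task one turn per round; the 'kernel' acts only at turn boundaries."""
    for _ in range(rounds):
        for task in tasks:
            budget = slice_len - task.debt
            if budget <= 0:
                # Turn forfeited: the whole slice goes toward repaying overuse.
                task.debt = -budget
                continue
            # Disengaged phase: the task submits requests back to back with no
            # kernel involvement; the last request may overrun the budget.
            n_requests = -(-budget // task.request_len)  # ceiling division
            ran = n_requests * task.request_len
            task.used += ran
            # Re-engagement: measure the overrun and charge it to the task.
            task.debt = ran - budget


# A well-behaved task next to one that issues oversized requests.
polite = Task("polite", request_len=1)
greedy = Task("greedy", request_len=25)
run_rounds([polite, greedy], slice_len=10, rounds=100)
for t in (polite, greedy):
    print(t.name, t.used, t.debt)
```

In this model, a task's consumption minus its outstanding debt always equals the number of rounds times `slice_len`, and the debt stays below one request length. A task that issues oversized requests therefore cannot pull ahead of its peers by more than a single request, even though the kernel never intercepts individual submissions.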
