Deadline-Based Scheduling for GPU with Preemption Support

Nicola Capodieci, R. Cavicchioli, M. Bertogna, Aingara Paramakuru
{"title":"Deadline-Based Scheduling for GPU with Preemption Support","authors":"Nicola Capodieci, R. Cavicchioli, M. Bertogna, Aingara Paramakuru","doi":"10.1109/RTSS.2018.00021","DOIUrl":null,"url":null,"abstract":"Modern automotive-grade embedded computing platforms feature high-performance Graphics Processing Units (GPUs) to support the massively parallel processing power needed for next-generation autonomous driving applications (e.g., Deep Neural Network (DNN) inference, sensor fusion, path planning, etc). As these workload-intensive activities are pushed to higher criticality levels, there is a stronger need for more predictable scheduling algorithms that are able to guarantee predictability without overly sacrificing GPU utilization. Unfortunately, the real-rime literature on GPU scheduling mostly considered limited (or null) preemption capabilities, while previous efforts in broader domains were often based on programming models and APIs that were not designed to support the real-rime requirements of recurring workloads. In this paper, we present the design of a prototype real-time scheduler for GPU activities on an embedded System on a Chip (SoC) featuring a cutting edge GPU architecture by NVIDIA adopted in the autonomous driving domain. The scheduler runs as a software partition on top of the NVIDIA hypervisor, and it leverages latest generation architectural features, such as pixel-level preemption and threadlevel preemption. Such a design allowed us to implement and test a preemptive Earliest Deadline First (EDF) scheduler for GPU tasks providing bandwidth isolations by means of a Constant Bandwidth Server (CBS). Our work involved investigating alternative programming models for compute APIs, allowing us to characterize CPU-to-GPU command submission with more detailed scheduling information. A detailed experimental characterization is presented to show the significant schedulability improvement of recurring real-time GPU tasks.","PeriodicalId":294784,"journal":{"name":"2018 IEEE Real-Time Systems Symposium (RTSS)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"59","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE Real-Time Systems Symposium (RTSS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RTSS.2018.00021","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 59

Abstract

Modern automotive-grade embedded computing platforms feature high-performance Graphics Processing Units (GPUs) to support the massively parallel processing power needed for next-generation autonomous driving applications (e.g., Deep Neural Network (DNN) inference, sensor fusion, path planning, etc). As these workload-intensive activities are pushed to higher criticality levels, there is a stronger need for more predictable scheduling algorithms that are able to guarantee predictability without overly sacrificing GPU utilization. Unfortunately, the real-rime literature on GPU scheduling mostly considered limited (or null) preemption capabilities, while previous efforts in broader domains were often based on programming models and APIs that were not designed to support the real-rime requirements of recurring workloads. In this paper, we present the design of a prototype real-time scheduler for GPU activities on an embedded System on a Chip (SoC) featuring a cutting edge GPU architecture by NVIDIA adopted in the autonomous driving domain. The scheduler runs as a software partition on top of the NVIDIA hypervisor, and it leverages latest generation architectural features, such as pixel-level preemption and threadlevel preemption. Such a design allowed us to implement and test a preemptive Earliest Deadline First (EDF) scheduler for GPU tasks providing bandwidth isolations by means of a Constant Bandwidth Server (CBS). Our work involved investigating alternative programming models for compute APIs, allowing us to characterize CPU-to-GPU command submission with more detailed scheduling information. A detailed experimental characterization is presented to show the significant schedulability improvement of recurring real-time GPU tasks.
基于截止日期的GPU调度与抢占支持
现代汽车级嵌入式计算平台具有高性能图形处理单元(gpu),以支持下一代自动驾驶应用所需的大规模并行处理能力(例如,深度神经网络(DNN)推理、传感器融合、路径规划等)。由于这些工作负载密集型活动被推到更高的临界级别,因此更需要更具可预测性的调度算法,这些算法能够在不过度牺牲GPU利用率的情况下保证可预测性。不幸的是,关于GPU调度的实时文献大多认为有限(或零)抢占能力,而之前在更广泛领域的努力通常基于编程模型和api,而这些模型和api的设计并不是为了支持重复工作负载的实时需求。在本文中,我们提出了一个基于NVIDIA在自动驾驶领域采用的尖端GPU架构的嵌入式片上系统(SoC)上GPU活动的原型实时调度程序的设计。调度器作为一个软件分区运行在NVIDIA管理程序之上,它利用了最新一代的体系结构特性,比如像素级抢占和线程级抢占。这样的设计允许我们实现和测试一个抢占式的最早截止日期优先(EDF)调度程序,用于通过恒定带宽服务器(CBS)提供带宽隔离的GPU任务。我们的工作涉及研究计算api的替代编程模型,允许我们使用更详细的调度信息来描述cpu到gpu的命令提交。详细的实验表征表明,循环实时GPU任务的可调度性显著提高。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信