Multitasking Real-time Embedded GPU Computing Tasks

Proceedings of the 7th International Workshop on Programming Models and Applications for Multicores and Manycores. Pub Date: 2016-03-12. DOI: 10.1145/2883404.2883408

Pınar Muyan-Özçelik, John Douglas Owens


Citations: 8

Abstract

In this study, we consider the specific characteristics of workloads that involve multiple real-time embedded GPU computing tasks and design several schedulers that use alternative approaches. Then, we compare the performance of schedulers and determine which scheduling approach is more effective for a given workload and why. The major conclusions of this study include: (a) Small kernels benefit from running kernels concurrently. (b) The combination of small kernels, high-priority kernels with longer runtimes, and lower-priority kernels with shorter runtimes benefits from a CPU scheduler that dynamically changes kernel order on the Fermi architecture. (c) Due to limitations of existing GPU architectures, currently CPU schedulers outperform their GPU counterparts. We also highlight the shortcomings of current GPU architectures with regard to running multiple real-time tasks, and recommend new features that would improve scheduling, including hardware priorities, preemption, programmable scheduling, and a common time concept and atomics across the CPU and GPU.
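Conclusion (b) can be made concrete with a toy model. The sketch below is not the schedulers evaluated in the paper: it assumes kernels run strictly back to back, that runtimes are known in advance, and that a lower priority number means more urgent. The `Kernel` class, the function names, and the runtimes are all hypothetical and chosen only for illustration. It compares launching kernels in arrival order with a CPU-side queue that reorders pending kernels by priority before launch.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Kernel:
    name: str
    priority: int      # assumed convention: lower number = more urgent
    runtime_ms: float  # assumed to be a known runtime estimate


def fifo_order(kernels):
    """Launch kernels in the order they arrived."""
    return list(kernels)


def priority_order(kernels):
    """Reorder pending kernels on the CPU so urgent ones launch first.

    The arrival index breaks ties, so equal-priority kernels keep
    their original relative order.
    """
    heap = [(k.priority, i, k) for i, k in enumerate(kernels)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


def completion_times(order):
    """Completion time of each kernel when run back to back (no overlap)."""
    t, done = 0.0, {}
    for k in order:
        t += k.runtime_ms
        done[k.name] = t
    return done


# Hypothetical workload: two short low-priority kernels arrive
# before one longer high-priority kernel.
pending = [
    Kernel("low_a", priority=2, runtime_ms=1.0),
    Kernel("low_b", priority=2, runtime_ms=1.0),
    Kernel("high", priority=0, runtime_ms=5.0),
]

fifo = completion_times(fifo_order(pending))
prio = completion_times(priority_order(pending))

print("FIFO:    ", fifo)
print("Priority:", prio)
```

In this toy workload the high-priority kernel finishes earlier when the CPU reorders the queue, and the low-priority kernels finish later as a result. Whether that trade-off pays off on real hardware depends on the workload and architecture, which is the question the study examines.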


Source journal

Proceedings of the 7th International Workshop on Programming Models and Applications for Multicores and Manycores

Self-citation rate

0.00%

Publication count