DualRT: A QoS-Aware Soft Real-Time Video Analytics Framework for Dual-Stage GPU-CPU Tasks on Edge

IF 1.5 | Zone 4, Computer Science | JCR Q3, Computer Science, Software Engineering
Changhong Zhu, Haitao Zhang, Xingtao Xu
Concurrency and Computation: Practice and Experience, volume 37 (18-20). Published 2025-07-03. DOI: 10.1002/cpe.70174
https://onlinelibrary.wiley.com/doi/10.1002/cpe.70174

Abstract

Edge cameras are now ubiquitous and, together with the recent boom in computer vision technology, this means a growing variety of video analytics tasks are processed at the edge. Supporting increasingly complex video analytics tasks on edge servers is challenging because request loads are unpredictable and resources are limited. Most existing works apply a single optimization approach that improves one performance metric in one processing stage, neglects the balance with other metrics, and leaves little room for optimization. For video analytics tasks that must be split into two GPU-CPU stages, this one-sided focus can unbalance execution performance or even degrade quality of service (QoS). In addition, making full use of scarce edge-server resources often requires scheduling multiple types of video analytics tasks on the same server. Most existing scheduling strategies, however, consider only how to allocate computational resources to end-to-end tasks; they are unaware of how tasks behave in their individual execution stages and of the mutual interference among tasks. Lacking stage-sensitivity and interference-sensitivity, such strategies can cause performance conflicts when multiple GPU-CPU dual-stage tasks run together, which degrades the overall QoS. To address these challenges, we first evaluate the impact of batching, frame rate control, resolution selection, and CPU concurrency on throughput, latency, and accuracy when running dual-stage tasks on edge platforms. We then propose DualRT, a soft real-time video analytics framework that optimizes the QoS of dual-stage tasks while avoiding request stacking on edge platforms. In DualRT's scheduling module, we design a scheduling method that combines a multi-agent deep reinforcement learning algorithm with a variable time window approach to schedule multiple dual-stage tasks, jointly controlling the batch size, resolution, frame rate, and CPU concurrency of each task. Our experimental results show that, compared with state-of-the-art solutions, DualRT improves QoS by 13.3% on average and maximum throughput by 24.6% on average.
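
To illustrate why tuning one stage in isolation can backfire, here is a minimal sketch of a dual-stage task with the four control knobs named in the abstract. It is not DualRT's model: the cost functions, constants, and all names (`TaskConfig`, `gpu_throughput`, `cpu_throughput`, `is_stacking`) are hypothetical and chosen only for illustration. The sketch assumes that the sustained rate of a two-stage pipeline is bounded by its slower stage, and that requests stack up whenever frames arrive faster than that bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskConfig:
    """The four per-task knobs named in the abstract."""
    batch_size: int       # frames per GPU inference batch
    resolution: int       # input height in pixels (e.g. 480, 720)
    frame_rate: float     # frames per second admitted from the camera
    cpu_concurrency: int  # parallel CPU workers for the second stage


def gpu_throughput(cfg: TaskConfig) -> float:
    """Hypothetical GPU-stage throughput in frames/s (toy cost model)."""
    per_frame_ms = 8.0 * (cfg.resolution / 720) ** 2  # assumed: cost grows with pixel count
    batch_overhead_ms = 12.0                          # assumed: fixed cost amortized per batch
    batch_ms = batch_overhead_ms + per_frame_ms * cfg.batch_size
    return 1000.0 * cfg.batch_size / batch_ms


def cpu_throughput(cfg: TaskConfig, per_frame_ms: float = 40.0) -> float:
    """Hypothetical CPU-stage throughput in frames/s, assuming linear scaling."""
    return 1000.0 * cfg.cpu_concurrency / per_frame_ms


def pipeline_throughput(cfg: TaskConfig) -> float:
    """The slower stage bounds the sustained rate of the whole pipeline."""
    return min(gpu_throughput(cfg), cpu_throughput(cfg))


def is_stacking(cfg: TaskConfig) -> bool:
    """Requests accumulate when frames arrive faster than the pipeline drains them."""
    return cfg.frame_rate > pipeline_throughput(cfg)


if __name__ == "__main__":
    base = TaskConfig(batch_size=4, resolution=720, frame_rate=30.0, cpu_concurrency=1)
    bigger_batch = TaskConfig(batch_size=8, resolution=720, frame_rate=30.0, cpu_concurrency=1)
    more_workers = TaskConfig(batch_size=4, resolution=720, frame_rate=30.0, cpu_concurrency=2)
    for name, cfg in [("base", base), ("bigger batch", bigger_batch), ("more workers", more_workers)]:
        print(f"{name}: pipeline={pipeline_throughput(cfg):.1f} fps, stacking={is_stacking(cfg)}")
```

Under these assumed numbers, doubling the batch size raises GPU-stage throughput but leaves the pipeline bound by the CPU stage, so requests still stack; adding a second CPU worker removes the stacking. This is the kind of cross-stage coupling that motivates controlling all four knobs jointly rather than one at a time.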

Source journal
Concurrency and Computation: Practice and Experience (Engineering & Technology; Computer Science: Theory & Methods)
CiteScore: 5.00
Self-citation rate: 10.00%
Articles published: 664
Review time: 9.6 months
About the journal: Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality, original research papers, and authoritative research review papers, in the overlapping fields of: Parallel and distributed computing; High-performance computing; Computational and data science; Artificial intelligence and machine learning; Big data applications, algorithms, and systems; Network science; Ontologies and semantics; Security and privacy; Cloud/edge/fog computing; Green computing; and Quantum computing.