DualRT: A QoS-Aware Soft Real-Time Video Analytics Framework for Dual-Stage GPU-CPU Tasks on Edge
Authors: Changhong Zhu, Haitao Zhang, Xingtao Xu
Journal: Concurrency and Computation: Practice and Experience, volume 37, issues 18-20
Publication date: 2025-07-03
DOI: 10.1002/cpe.70174
URL: https://onlinelibrary.wiley.com/doi/10.1002/cpe.70174
Impact factor: 1.5; JCR: Q3 (Computer Science, Software Engineering)
Citations: 0
Abstract
Edge cameras are now ubiquitous, and with the recent boom in computer vision, a growing variety of video analytics tasks is processed at the edge. Supporting increasingly complex video analytics tasks on edge servers is challenging because request loads are unpredictable and resources are limited. Most existing works apply a single optimization approach that improves one performance metric in one processing stage while neglecting the balance with other metrics, which leaves little room for optimization. For video analytics tasks that must be split into two stages, one on the GPU and one on the CPU, this one-sided focus can unbalance execution performance or even degrade quality of service (QoS). In addition, fully utilizing the valuable resources of an edge server often requires scheduling multiple types of video analytics tasks on it. Most existing scheduling strategies, however, consider only how to allocate computational resources to end-to-end tasks; they account for neither how tasks behave in their different execution stages nor how tasks interfere with one another. Lacking stage sensitivity and interference sensitivity, such strategies can cause performance conflicts when multiple GPU-CPU dual-stage tasks run together, which harms overall QoS.
To address these challenges, we first evaluate how batch processing, frame rate control, resolution selection, and concurrent CPU processing affect throughput, latency, and accuracy when running dual-stage tasks on edge platforms. We then propose DualRT, a soft real-time video analytics framework that optimizes the QoS of dual-stage tasks while avoiding request stacking on edge platforms. In DualRT's scheduling module, we design a scheduling method that combines a multi-agent deep reinforcement learning algorithm with a variable time window approach to schedule multiple dual-stage tasks, jointly controlling batch size, resolution, frame rate, and CPU concurrency for each task. Our experimental results show that, compared with state-of-the-art solutions, DualRT improves QoS by 13.3% on average and maximum throughput by 24.6% on average.
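The four per-task controls named in the abstract (batch size, resolution, frame rate, CPU concurrency) can be pictured as one joint configuration per task. The following is a minimal illustrative sketch, not the authors' implementation: the candidate values, the names `TaskConfig`, `joint_action_space`, `pipeline_throughput`, and `keeps_up`, and the bottleneck throughput model are all assumptions introduced here. It only shows why the knobs have to be chosen together: if either the GPU stage or the CPU stage is slower than the incoming frame rate, requests pile up.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TaskConfig:
    """One joint setting of the four knobs for a single dual-stage task."""
    batch_size: int   # frames per GPU inference batch
    resolution: int   # input frame height in pixels
    frame_rate: int   # frames per second sampled from the stream
    cpu_workers: int  # concurrent CPU workers for the second stage


# Hypothetical candidate values; the abstract does not list the real ranges.
BATCH_SIZES = (1, 2, 4, 8)
RESOLUTIONS = (360, 720, 1080)
FRAME_RATES = (5, 15, 30)
CPU_WORKERS = (1, 2, 4)


def joint_action_space():
    """Enumerate every combination of the four knobs."""
    return [
        TaskConfig(b, r, f, c)
        for b, r, f, c in product(BATCH_SIZES, RESOLUTIONS, FRAME_RATES, CPU_WORKERS)
    ]


def pipeline_throughput(cfg, gpu_batch_latency_s, cpu_frame_latency_s):
    """Toy model (an assumption): a two-stage pipeline runs at the rate of its slower stage."""
    gpu_rate = cfg.batch_size / gpu_batch_latency_s   # frames/s through the GPU stage
    cpu_rate = cfg.cpu_workers / cpu_frame_latency_s  # frames/s through the CPU stage
    return min(gpu_rate, cpu_rate)


def keeps_up(cfg, gpu_batch_latency_s, cpu_frame_latency_s):
    """True if the pipeline absorbs the offered frame rate, so requests do not stack."""
    return pipeline_throughput(cfg, gpu_batch_latency_s, cpu_frame_latency_s) >= cfg.frame_rate
```

Even with these small hypothetical ranges the joint space has 108 configurations per task, which hints at why a learned scheduler may be preferable to exhaustive search once several tasks share one edge server.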
About the journal:
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality, original research papers, and authoritative research review papers, in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.