Retina-U: A Two-Level Real-Time Analytics Framework for UHD Live Video Streaming

IF 3.2 | Region 1: Computer Science | JCR Q2: ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Broadcasting | Pub Date: 2024-01-10 | DOI: 10.1109/TBC.2023.3345646

Wei Zhang;Yunpeng Jing;Yuan Zhang;Tao Lin;Jinyao Yan

IEEE Transactions on Broadcasting, vol. 70, no. 2, pp. 429-440. https://ieeexplore.ieee.org/document/10387718/

Citations: 0

Abstract

UHD live video streaming, with its high video resolution, offers a wealth of fine-grained scene details, presenting opportunities for intricate video analytics. However, current real-time video streaming analytics solutions are inadequate for analyzing these detailed features, often leading to low accuracy in the analysis of small objects with fine details. Furthermore, due to the high bitrate and precision of UHD streaming, existing real-time inference frameworks typically suffer from a low analyzed frame rate caused by the significant computational cost involved. To meet the accuracy requirement and improve the analyzed frame rate, we introduce Retina-U, a real-time analytics framework for UHD video streaming. Specifically, we first present SECT, a real-time, DNN-model-level inference model that enhances inference accuracy in dynamic UHD streaming with an abundance of small objects. SECT uses a slicing-based enhanced inference (SEI) method and Cascade Sparse Queries (CSQ)-based fine-tuning to improve accuracy, and leverages a lightweight tracker to achieve a high analyzed frame rate. At the system level, to further improve inference accuracy and bolster the analyzed frame rate, we propose a deep reinforcement learning-based resource management algorithm for real-time joint network adaptation, resource allocation, and server selection. By simultaneously considering network and computational resources, we can maximize the comprehensive analytic performance in a dynamic and complex environment. Experimental results demonstrate the effectiveness of Retina-U, showcasing improvements in accuracy of up to 38.01% and inference speed acceleration of up to 24.33%.
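The abstract does not say how SEI slices a frame or merges results, so the following is only a generic illustration of slicing-based inference, not the authors' implementation. It cuts a UHD frame into overlapping tiles, runs a detector per tile, shifts the boxes back into frame coordinates, and removes duplicates from the overlap regions with plain non-maximum suppression. The tile size, overlap, IoU threshold, and the `detect_tile` callback (a stand-in for "crop the tile and run a DNN") are all assumptions made for this sketch.

```python
from typing import Callable, List, Tuple

Rect = Tuple[int, int, int, int]                # x1, y1, x2, y2 of a tile
Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score


def tile_origins(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of tiles that cover [0, length) with the given overlap."""
    assert 0 <= overlap < tile, "overlap must be smaller than the tile size"
    if tile >= length:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # last tile sits flush with the frame edge
    return origins


def slice_frame(width: int, height: int, tile: int = 1280,
                overlap: int = 256) -> List[Rect]:
    """Tile rectangles in frame coordinates, row by row."""
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in tile_origins(height, tile, overlap)
            for x in tile_origins(width, tile, overlap)]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], iou_threshold: float = 0.5) -> List[Box]:
    """Greedy non-maximum suppression, highest score first."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_threshold for k in kept):
            kept.append(box)
    return kept


def sliced_detect(width: int, height: int,
                  detect_tile: Callable[[Rect], List[Box]],
                  tile: int = 1280, overlap: int = 256,
                  iou_threshold: float = 0.5) -> List[Box]:
    """Run a hypothetical per-tile detector and merge results per frame.

    `detect_tile` receives a tile rectangle and returns boxes in
    tile-local coordinates; it is not part of any real library.
    """
    merged: List[Box] = []
    for rect in slice_frame(width, height, tile, overlap):
        ox, oy = rect[0], rect[1]
        for bx1, by1, bx2, by2, score in detect_tile(rect):
            merged.append((bx1 + ox, by1 + oy, bx2 + ox, by2 + oy, score))
    return nms(merged, iou_threshold)
```

With these default values a 3840x2160 frame is covered by a 4x2 grid of tiles. Overlap is there so that a small object cut by one tile boundary is still fully visible in a neighboring tile, at the cost of duplicate detections that the merge step has to remove.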




Source journal

IEEE Transactions on Broadcasting (Engineering & Technology: Telecommunications)

CiteScore

9.40

Self-citation rate

31.10%

Articles published

Review time

6-12 weeks

Journal description: The Society’s Field of Interest is “Devices, equipment, techniques and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects.” In addition to this formal FOI statement, which is used to provide guidance to the Publications Committee in the selection of content, the AdCom has further resolved that “broadcast systems includes all aspects of transmission, propagation, and reception.”