Elixir: A System to Enhance Data Quality for Multiple Analytics on a Video Stream

2023 IEEE International Conference on Smart Computing (SMARTCOMP). Pub Date: 2022-12-08. DOI: 10.1109/SMARTCOMP58114.2023.00030

Sibendu Paul, Kunal Rao, G. Coviello, Murugan Sankaradas, Oliver Po, Y. C. Hu, S. Chakradhar



Abstract

IoT sensors, especially video cameras, are deployed around the world to perform a variety of computer vision tasks in verticals including retail, healthcare, safety and security, transportation, and manufacturing. To amortize their high deployment effort and cost, it is desirable to run multiple video analytics tasks, which we refer to as Analytical Units (AUs), on the video feed from each camera. Because AUs typically use deep-learning-based AI/ML models, their performance depends on the quality of the input video. Recent work has shown that, in a single-AU setting, dynamically adjusting the camera settings exposed by popular network cameras can improve the quality of the video feed and hence the AU accuracy. In this paper, we first show that in a multi-AU setting, changing the camera settings has a disproportionate impact on the performance of different AUs. In particular, the optimal setting for one AU may severely degrade the performance of another AU, and the impact on different AUs varies as environmental conditions change. We then present Elixir, a system that enhances video stream quality for multiple analytics running on the same video stream. Elixir leverages Multi-Objective Reinforcement Learning (MORL), in which the RL agent caters to the objectives of the different AUs and adjusts the camera settings to enhance the performance of all AUs simultaneously. To define the multiple objectives in MORL, we develop new AU-specific quality estimators, one for each AU. We evaluate Elixir through real-world experiments on a testbed with three cameras deployed next to each other (overlooking a large enterprise parking lot), running Elixir and two baseline approaches, respectively. Elixir correctly detects 7.1% (22,068) and 5.0% (15,731) more cars, 94% (551) and 72% (478) more faces, and 670.4% (4,975) and 158.6% (3,507) more persons than the default-setting and time-sharing approaches, respectively. It also detects 115 license plates, far more than the time-sharing approach (7) and the default setting (0).
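
The abstract's central tension (the best camera setting for one AU can hurt another) can be illustrated with a small sketch. The code below is not Elixir's agent and does not perform reinforcement learning. It only shows linear scalarization, one common way to fold several objectives into a single score in multi-objective settings. The `brightness` parameter, the two toy quality functions, their target values, and the equal weights are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, Sequence

# Hypothetical camera setting: parameter name -> value.
Setting = Dict[str, float]
# Hypothetical per-AU quality estimator: scores a setting, higher is better.
Estimator = Callable[[Setting], float]


def scalarize(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-AU scores into one value with a weighted sum."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(s * w for s, w in zip(scores, weights))


def pick_setting(
    candidates: Sequence[Setting],
    estimators: Sequence[Estimator],
    weights: Sequence[float],
) -> Setting:
    """Return the candidate setting with the highest scalarized score."""
    return max(
        candidates,
        key=lambda c: scalarize([est(c) for est in estimators], weights),
    )


def car_quality(setting: Setting) -> float:
    # Toy assumption: the car detector works best at brightness 50.
    return 1.0 - ((setting["brightness"] - 50.0) / 100.0) ** 2


def face_quality(setting: Setting) -> float:
    # Toy assumption: the face detector works best at brightness 70.
    return 1.0 - ((setting["brightness"] - 70.0) / 100.0) ** 2


candidates = [{"brightness": b} for b in (50.0, 60.0, 70.0)]

# Optimizing for a single AU picks that AU's own optimum.
best_for_cars = pick_setting(candidates, [car_quality], [1.0])
best_for_faces = pick_setting(candidates, [face_quality], [1.0])

# Optimizing for both AUs with equal weights picks a compromise setting.
best_for_both = pick_setting(candidates, [car_quality, face_quality], [0.5, 0.5])

print(best_for_cars, best_for_faces, best_for_both)
```

In this toy setup the single-AU optima differ, and the equally weighted choice lands between them. A learned agent such as the one the abstract describes would additionally have to adapt as environmental conditions change, which a fixed scoring function like this does not capture.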


