MOSAIC: Spatially-Multiplexed Edge AI Optimization over Multiple Concurrent Video Sensing Streams

Ila Gokarn, H. Sabbella, Yigong Hu, T. Abdelzaher, Archan Misra
Published in: Proceedings of the 14th Conference on ACM Multimedia Systems, 2023-05-05
DOI: 10.1145/3587819.3590986 (https://doi.org/10.1145/3587819.3590986)
Citations: 2

Abstract

Sustaining high fidelity and high throughput for perception tasks over vision sensor streams on edge devices remains a formidable challenge, especially given the continuing increase in image sizes (e.g., those generated by 4K cameras) and in the complexity of DNN models. One promising approach is criticality-aware processing, in which computation is directed selectively to the "critical" portions of individual image frames. We introduce MOSAIC, a novel system for such criticality-aware concurrent processing of multiple vision sensing streams that provides a multiplicative increase in achievable throughput with negligible loss in perception fidelity. MOSAIC determines critical regions in the images received from multiple vision sensors and spatially bin-packs these regions, using a novel multi-scale Mosaic Across Scales (MoS) tiling strategy, into a single "canvas frame" sized such that the edge device can retain sufficiently high processing throughput. Experimental studies using benchmark datasets for two tasks, Automatic License Plate Recognition and Drone-based Pedestrian Detection, show that MOSAIC, executing on a Jetson TX2 edge device, can provide dramatic gains in the throughput vs. fidelity tradeoff. For instance, for drone-based pedestrian detection with a batch size of 4, MOSAIC can pack input frames from 6 cameras to achieve 4.75X (475%) higher throughput (23 FPS per camera, 138 FPS cumulatively) with ≤ 1% accuracy loss, compared to a First Come First Serve (FCFS) processing paradigm.
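To make the idea of spatially bin-packing critical regions into a canvas frame more concrete, the following is a minimal sketch in Python. It is not the MoS tiling strategy described in the paper, whose details are not given in the abstract; it swaps in a generic next-fit shelf heuristic (tallest regions first) purely for illustration. The names `Region`, `Placement`, and `pack_regions`, as well as the canvas and region sizes, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Region:
    """A critical region cropped from one camera frame (hypothetical type)."""
    camera_id: int
    width: int
    height: int


@dataclass(frozen=True)
class Placement:
    """Top-left position of a region on the shared canvas frame."""
    region: Region
    x: int
    y: int


def pack_regions(
    regions: List[Region], canvas_w: int, canvas_h: int
) -> Tuple[List[Placement], List[Region]]:
    """Pack regions onto one canvas using a next-fit shelf heuristic.

    This is an illustrative stand-in, NOT the paper's MoS strategy.
    Returns (placed, leftover); leftover regions did not fit.
    """
    placed: List[Placement] = []
    leftover: List[Region] = []
    shelf_y = 0   # top edge of the current shelf (row)
    shelf_h = 0   # height of the tallest region on the current shelf
    cursor_x = 0  # next free x position on the current shelf

    # Tallest first, so every region on a shelf fits under its first one.
    for region in sorted(regions, key=lambda r: r.height, reverse=True):
        if region.width > canvas_w or region.height > canvas_h:
            leftover.append(region)  # can never fit on this canvas
            continue
        if cursor_x + region.width > canvas_w:
            # Current shelf is full: open a new shelf below it.
            shelf_y += shelf_h
            shelf_h = 0
            cursor_x = 0
        if shelf_y + region.height > canvas_h:
            leftover.append(region)  # no vertical space left
            continue
        placed.append(Placement(region, cursor_x, shelf_y))
        cursor_x += region.width
        shelf_h = max(shelf_h, region.height)
    return placed, leftover


if __name__ == "__main__":
    # Hypothetical critical regions from four cameras, 640x320 canvas.
    demo = [
        Region(camera_id=0, width=300, height=200),
        Region(camera_id=1, width=300, height=200),
        Region(camera_id=2, width=200, height=100),
        Region(camera_id=3, width=500, height=100),
    ]
    placed, leftover = pack_regions(demo, canvas_w=640, canvas_h=320)
    for p in placed:
        print(f"camera {p.region.camera_id} -> ({p.x}, {p.y})")
    print("not packed:", [r.camera_id for r in leftover])
```

In this sketch the canvas size is the tuning knob: a smaller canvas keeps per-inference cost low but leaves more regions unpacked, which mirrors the throughput vs. fidelity tradeoff the abstract describes. A real system would also need to map detections on the canvas back to the originating camera and frame coordinates, which is why each `Placement` keeps a reference to its source `Region`.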