Extract-Transform-Load for Video Streams

Ferdinand Kossmann, Ziniu Wu, Eugenie Lai, Nesime Tatbul, Lei Cao, Tim Kraska, S. Madden
Proc. VLDB Endow., published 2023-05-01. DOI: 10.14778/3598581.3598600 (https://doi.org/10.14778/3598581.3598600)

Abstract

Social media, self-driving cars, and traffic cameras produce video streams at large scale and low cost. However, storing and querying video at such scale is prohibitively expensive. We propose to treat large-scale video analytics as a data warehousing problem: video is a format that is easy to produce but needs to be transformed into an application-specific format that is easy to query. Analogously, we define the problem of Video Extract-Transform-Load (V-ETL). V-ETL systems need to reduce the cost of running a user-defined V-ETL job while also giving throughput guarantees to keep up with the rate at which data is produced. We find that no current system sufficiently fulfills both needs and therefore propose Skyscraper, a system tailored to V-ETL. Skyscraper can execute arbitrary video ingestion pipelines and adaptively tunes them to reduce cost with minimal or no quality degradation, e.g., by adjusting sampling rates and resolutions to the ingested content. As a result, Skyscraper can be provisioned with cheap on-premises compute and uses a combination of buffering and cloud bursting to deal with peaks in workload caused by expensive processing configurations. In our experiments, we find that Skyscraper significantly reduces the cost of V-ETL ingestion compared to adaptations of current state-of-the-art (SOTA) systems, while at the same time giving robustness guarantees that these systems lack.
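
The interplay of on-premises compute, buffering, and cloud bursting mentioned in the abstract can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions and not Skyscraper's actual algorithm: the function `simulate`, the `StepResult` fields, the abstract "work units", and the sample trace are all hypothetical. Each step, local compute processes as much pending work as it can, the remainder is buffered, and only work that would overflow the buffer is sent to the cloud.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    processed_on_prem: float  # work units handled by local compute this step
    sent_to_cloud: float      # work units offloaded this step (cloud bursting)
    buffered: float           # backlog carried over to the next step


def simulate(workloads, on_prem_capacity, buffer_limit):
    """Toy model (hypothetical, not the paper's method): process what local
    compute allows, buffer the rest, and burst to the cloud only when the
    buffer would otherwise overflow."""
    backlog = 0.0
    results = []
    for work in workloads:
        pending = backlog + work
        processed = min(pending, on_prem_capacity)
        remaining = pending - processed
        # Anything beyond the buffer limit cannot wait and goes to the cloud.
        burst = max(0.0, remaining - buffer_limit)
        backlog = remaining - burst
        results.append(StepResult(processed, burst, backlog))
    return results


if __name__ == "__main__":
    # Hypothetical per-step work with a peak in the middle, e.g. caused by
    # an expensive processing configuration.
    trace = [2.0, 2.0, 8.0, 8.0, 2.0, 1.0]
    for step, result in enumerate(
        simulate(trace, on_prem_capacity=4.0, buffer_limit=3.0)
    ):
        print(step, result)
```

In this model, a larger buffer or more on-premises capacity reduces the amount of work sent to the cloud, while the backlog never exceeds `buffer_limit`, which is one simple way to picture a throughput guarantee.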