Self-Enhancing Video Data Management System for Compositional Events with Large Language Models [Technical Report]

Enhao Zhang, Nicole Sullivan, Brandon Haynes, Ranjay Krishna, Magdalena Balazinska
arXiv - CS - Databases, published 2024-08-05. DOI: arxiv-2408.02243

Abstract

Complex video queries can be answered by decomposing them into modular subtasks. However, existing video data management systems assume the existence of predefined modules for each subtask. We introduce VOCAL-UDF, a novel self-enhancing system that supports compositional queries over videos without the need for predefined modules. VOCAL-UDF automatically identifies and constructs missing modules and encapsulates them as user-defined functions (UDFs), thus expanding its querying capabilities. To achieve this, we formulate a unified UDF model that leverages large language models (LLMs) to aid in new UDF generation. VOCAL-UDF handles a wide range of concepts by supporting both program-based UDFs (i.e., Python functions generated by LLMs) and distilled-model UDFs (lightweight vision models distilled from strong pretrained models). To resolve the inherent ambiguity in user intent, VOCAL-UDF generates multiple candidate UDFs and uses active learning to efficiently select the best one. With the self-enhancing capability, VOCAL-UDF significantly improves query performance across three video datasets.
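
The selection step described above (generate several candidate UDFs, then use active learning to pick one) can be illustrated with a small sketch. This is not VOCAL-UDF's actual algorithm or API: the abstract does not specify them. The sketch assumes a simple disagreement-based sampling strategy, and every name in it (`Frame`, `select_udf`, the `distance` feature, the thresholds) is hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical types: a frame is a dict of numeric features, and a
# candidate UDF is a boolean predicate over one frame.
Frame = Dict[str, float]
UDF = Callable[[Frame], bool]


def disagreement(candidates: Sequence[UDF], frame: Frame) -> int:
    """Size of the minority vote: 0 when all candidates agree."""
    votes = sum(1 for udf in candidates if udf(frame))
    return min(votes, len(candidates) - votes)


def select_udf(
    candidates: Sequence[UDF],
    pool: Sequence[Frame],
    oracle: Callable[[Frame], bool],
    budget: int,
) -> int:
    """Return the index of the candidate that best matches the labels
    collected from `oracle` (standing in for the user) within `budget`
    label requests. Assumed strategy: always ask about the unlabeled
    frame on which the candidates disagree most."""
    correct: List[int] = [0] * len(candidates)
    unlabeled = list(range(len(pool)))
    for _ in range(min(budget, len(pool))):
        pick = max(unlabeled, key=lambda i: disagreement(candidates, pool[i]))
        unlabeled.remove(pick)
        label = oracle(pool[pick])
        for k, udf in enumerate(candidates):
            if udf(pool[pick]) == label:
                correct[k] += 1
    return max(range(len(candidates)), key=lambda k: correct[k])


# Three hypothetical program-based candidates for an ambiguous "near"
# predicate, differing only in the distance threshold.
candidates: List[UDF] = [
    lambda f: f["distance"] < 50,
    lambda f: f["distance"] < 100,
    lambda f: f["distance"] < 200,
]
pool: List[Frame] = [{"distance": d} for d in (10, 60, 90, 120, 150, 300)]


def user_intent(frame: Frame) -> bool:
    # Stand-in for the user's labels; here the user means "< 100".
    return frame["distance"] < 100


best = select_udf(candidates, pool, user_intent, budget=3)
print(best)  # index of the selected candidate
```

Frames on which all candidates agree (distances 10 and 300 here) carry no information for choosing among them, so the sketch spends its labeling budget only on contested frames. A real system would likely use a more refined acquisition function and would also have to handle distilled-model UDFs, which this sketch does not cover.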