Sequence-based kernels for online concept detection in video

Automated Information Extraction in Media Production | Pub Date: 2011-12-01 | DOI: 10.1145/2072552.2072554

W. Bailer


Citations: 4

Abstract

Kernel methods, e.g. Support Vector Machines, have been successfully applied to classification problems such as concept detection in video. In order to capture concepts and events with longer temporal extent, kernels for sequences of feature vectors have been proposed, e.g. based on temporal pyramid matching or sequence alignment. However, all these approaches need a temporal segmentation of the video, as the kernel is applied to the feature vectors of a segment. In (semi-)supervised training, this is not a problem, as the ground truth is annotated on a temporal segment. When performing online concept detection on a live video stream, (i) no segmentation exists and (ii) the latency must be kept as low as possible. Re-evaluating the kernel for each temporal position of a sliding window is computationally prohibitive. We thus propose variants of the temporal pyramid matching, all-subsequences and longest common subsequence kernels that can be calculated efficiently for a temporal sliding window. An arbitrary kernel function can be plugged in to determine the similarity of feature vectors of individual samples. We evaluate the proposed kernels on the TRECVID 2007 High-level Feature Extraction data set and show that the sliding window variants for online detection perform as well as or better than the segment-based ones, while the runtime is reduced by at least 30%.
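The cost argument in the abstract can be illustrated with a deliberately simple stand-in, which is not one of the kernels proposed in the paper. The sketch below assumes a sequence kernel defined as the mean of all pairwise base-kernel values between a reference sequence and the current window (roughly the coarsest, unordered level of a pyramid match). The names `rbf`, `SlidingMeanKernel` and `naive_mean_kernel` are hypothetical. Each incoming sample costs one column of base-kernel evaluations, and the column that leaves the window is subtracted instead of the whole window being re-evaluated.

```python
from collections import deque
import math


def rbf(x, y, gamma=0.5):
    """Base kernel on two feature vectors; any other kernel could be plugged in."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


class SlidingMeanKernel:
    """Mean pairwise base-kernel value between a fixed reference sequence
    and the most recent `window` samples of a stream (illustrative only)."""

    def __init__(self, reference, window, base_kernel=rbf):
        self.reference = reference
        self.window = window
        self.base_kernel = base_kernel
        self.columns = deque()  # one cached column sum per sample in the window
        self.total = 0.0        # running sum over all cached columns

    def push(self, sample):
        # Only the new sample is compared against the reference sequence.
        column = sum(self.base_kernel(r, sample) for r in self.reference)
        self.columns.append(column)
        self.total += column
        # Drop the contribution of the sample that slid out of the window.
        if len(self.columns) > self.window:
            self.total -= self.columns.popleft()
        return self.total / (len(self.reference) * len(self.columns))


def naive_mean_kernel(reference, samples, base_kernel=rbf):
    """Full re-evaluation for one window position, used here as a cross-check."""
    pairs = len(reference) * len(samples)
    return sum(base_kernel(r, s) for r in reference for s in samples) / pairs
```

With a reference of length m and a window of length w, the naive version needs m·w base-kernel evaluations per window position, while `push` needs m. Kernels that depend on temporal order, such as the longest common subsequence kernel named in the abstract, need more bookkeeping than a running sum, so this sketch only conveys the caching idea.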

