Video skimming and characterization through the combination of image and language understanding

Proceedings 1998 IEEE International Workshop on Content-Based Access of Image and Video Database
Pub Date: 1998-01-03
DOI: 10.1109/CAIVD.1998.646034

Michael A. Smith, T. Kanade

Citations: 202

Abstract

Digital video is rapidly becoming important for education, entertainment and a host of multimedia applications. With video collections growing to thousands of hours, technology is needed to browse segments effectively in a short time without losing the content of the video. We propose a method to extract the significant audio and video information and create a skim video that represents a very short synopsis of the original. The goal of this work is to show the utility of integrating language and image understanding techniques for video skimming by extracting significant information, such as specific objects, audio keywords and relevant video structure. The resulting skim video is much shorter, with compaction as high as 20:1, yet it retains the essential content of the original segment. We have conducted a user study to test the content summarization and the effectiveness of the skim as a browsing tool.




