Augmented transition networks as video browsing models for multimedia databases and multimedia information systems

Shu‐Ching Chen, S. Sista, M. Shyu, R. Kashyap
{"title":"Augmented transition networks as video browsing models for multimedia databases and multimedia information systems","authors":"Shu‐Ching Chen, S. Sista, M. Shyu, R. Kashyap","doi":"10.1109/TAI.1999.809783","DOIUrl":null,"url":null,"abstract":"In an interactive multimedia information system, users should have the flexibility to browse and choose various scenarios they want to see. This means that two-way communications should be captured by the conceptual model. Digital video has gained increasing popularity in many multimedia applications. Instead of sequential access to the video contents, the structuring and modeling of video data so that users can quickly and easily browse and retrieve interesting materials has become an important issue in designing multimedia information systems. An abstract semantic model called the augmented transition network (ATN), which can model video data and user interactions, is proposed in this paper. An ATN and its subnetworks can model video data based on different granularities, such as scenes, shots and key frames. Multimedia input strings are used as inputs for ATNs. The details of how to use multimedia input strings to model video data are also discussed. Key frame selection is based on the temporal and spatial relations of semantic objects in each shot. These relations are captured from our proposed unsupervised video segmentation method, which considers the problem of partitioning each frame as a joint estimation of the partition and class parameter variables. Unlike existing semantic models, which only model multimedia presentation, multimedia database searching or browsing, ATNs together with multimedia input strings can model these three in one framework.","PeriodicalId":194023,"journal":{"name":"Proceedings 11th International Conference on Tools with Artificial Intelligence","volume":"79 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1999-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"45","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 11th International Conference on Tools with Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TAI.1999.809783","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 45

Abstract

In an interactive multimedia information system, users should have the flexibility to browse and choose the scenarios they want to see, which means the conceptual model must capture two-way communication. Digital video has gained increasing popularity in many multimedia applications. Rather than forcing sequential access to video content, structuring and modeling video data so that users can quickly and easily browse and retrieve material of interest has become an important issue in designing multimedia information systems. This paper proposes an abstract semantic model, the augmented transition network (ATN), that can model both video data and user interactions. An ATN and its subnetworks can model video data at different granularities, such as scenes, shots, and key frames. Multimedia input strings serve as the inputs to ATNs, and the paper discusses in detail how these strings model video data. Key frame selection is based on the temporal and spatial relations of the semantic objects in each shot. These relations are captured by our proposed unsupervised video segmentation method, which treats the partitioning of each frame as a joint estimation of the partition and the class parameter variables. Unlike existing semantic models, which model only multimedia presentation, multimedia database searching, or browsing, ATNs together with multimedia input strings model all three in a single framework.
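The abstract names the mechanism but gives no construction, so a small sketch may help: an ATN is a set of networks whose arcs either consume a terminal symbol from the input string or push into a subnetwork, which is how a top-level scene network can delegate to shot and key-frame subnetworks. The Python below is a minimal sketch under assumed conventions; the network and state names and the two-symbol alphabet ("shot", "kf") are illustrative, it is not the authors' implementation, and it omits the registers and arc tests that make ATNs "augmented".

```python
# Minimal ATN sketch: networks over scenes, shots, and key frames, driven
# by a toy "multimedia input string". All names and symbols are assumptions.
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Arc:
    target: str                   # state reached after traversing the arc
    symbol: Optional[str] = None  # CAT arc: consume this terminal symbol
    push: Optional[str] = None    # PUSH arc: descend into a subnetwork first


@dataclass
class Network:
    start: str
    finals: set
    arcs: dict = field(default_factory=dict)  # state -> list of Arc


class ATN:
    def __init__(self, networks: dict, top: str):
        self.networks = networks
        self.top = top

    def accepts(self, symbols: list) -> bool:
        # Accept iff some traversal of the top network consumes every symbol.
        return any(end == len(symbols)
                   for end in self._walk(self.top, symbols, 0))

    def _walk(self, name: str, symbols: list, pos: int) -> Iterator[int]:
        net = self.networks[name]
        yield from self._from_state(net, net.start, symbols, pos)

    def _from_state(self, net, state, symbols, pos) -> Iterator[int]:
        # Yield every input position at which this network may legally stop.
        if state in net.finals:
            yield pos
        for arc in net.arcs.get(state, []):
            if arc.push is not None:          # PUSH: recurse into subnetwork
                for mid in self._walk(arc.push, symbols, pos):
                    yield from self._from_state(net, arc.target, symbols, mid)
            elif pos < len(symbols) and symbols[pos] == arc.symbol:
                yield from self._from_state(net, arc.target, symbols, pos + 1)


# Toy grammar: a video is one or more scenes; a scene is one or more shots;
# a shot is a "shot" boundary symbol followed by one or more key frames.
nets = {
    "VIDEO": Network("v0", {"v1"}, {"v0": [Arc("v1", push="SCENE")],
                                    "v1": [Arc("v1", push="SCENE")]}),
    "SCENE": Network("s0", {"s1"}, {"s0": [Arc("s1", push="SHOT")],
                                    "s1": [Arc("s1", push="SHOT")]}),
    "SHOT":  Network("t0", {"t2"}, {"t0": [Arc("t1", symbol="shot")],
                                    "t1": [Arc("t2", symbol="kf")],
                                    "t2": [Arc("t2", symbol="kf")]}),
}

atn = ATN(nets, top="VIDEO")
print(atn.accepts(["shot", "kf", "kf", "shot", "kf"]))  # True
print(atn.accepts(["kf", "shot"]))                      # False
```

Generators make backtracking over alternative parses automatic: each network yields every input position at which it could legally stop, and the top level accepts only if some traversal consumes the whole multimedia input string.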