Augmented transition networks as video browsing models for multimedia databases and multimedia information systems
Authors: Shu‐Ching Chen, S. Sista, M. Shyu, R. Kashyap
Published in: Proceedings 11th International Conference on Tools with Artificial Intelligence, 1999-11-08
DOI: 10.1109/TAI.1999.809783 (https://doi.org/10.1109/TAI.1999.809783)
Citations: 45
Abstract
In an interactive multimedia information system, users should be able to browse freely and choose the scenarios they want to see, so the conceptual model must capture two-way communication. Digital video has gained increasing popularity in many multimedia applications. Structuring and modeling video data so that users can browse and retrieve material of interest quickly and easily, rather than accessing the contents sequentially, has become an important issue in designing multimedia information systems. An abstract semantic model called the augmented transition network (ATN), which can model both video data and user interactions, is proposed in this paper. An ATN and its subnetworks can model video data at different granularities, such as scenes, shots, and key frames. Multimedia input strings serve as the inputs to ATNs, and the details of how they are used to model video data are also discussed. Key frames are selected according to the temporal and spatial relations of the semantic objects in each shot. These relations are captured by our proposed unsupervised video segmentation method, which treats the partitioning of each frame as a joint estimation of the partition and the class parameter variables. Unlike existing semantic models, which model only multimedia presentation, multimedia database searching, or browsing, ATNs together with multimedia input strings can model all three in one framework.
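The idea of a network whose arcs can expand into subnetworks at finer granularity can be illustrated with a minimal sketch. This is not the paper's formalism: the class names `Arc` and `Network`, the `browse` function, the state names, and the `choose` and `expand` callbacks are all hypothetical, and the sketch omits the multimedia input string notation and the conditions and actions that make a transition network "augmented". It only assumes that a scene-level arc may point to a shot-level subnetwork and that the user picks among alternative arcs while browsing.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Arc:
    label: str                             # video unit shown on this arc (hypothetical naming)
    target: str                            # state reached after traversing the arc
    subnetwork: Optional["Network"] = None  # finer-granularity network, if any


@dataclass
class Network:
    start: str
    final: str
    arcs: dict = field(default_factory=dict)  # source state -> list of outgoing arcs

    def add_arc(self, source, label, target, subnetwork=None):
        self.arcs.setdefault(source, []).append(Arc(label, target, subnetwork))


def browse(network, choose, expand, depth=0):
    """Walk from the start state to the final state.

    `choose` stands in for the user selecting one of several outgoing arcs;
    `expand` stands in for the user asking to drill into a subnetwork.
    Returns the (depth, label) pairs visited, in order.
    """
    visited = []
    state = network.start
    while state != network.final:
        options = network.arcs[state]
        arc = choose(options) if len(options) > 1 else options[0]
        visited.append((depth, arc.label))
        if arc.subnetwork is not None and expand(arc):
            visited.extend(browse(arc.subnetwork, choose, expand, depth + 1))
        state = arc.target
    return visited


# Shot-level subnetwork for the first scene: two shots in sequence.
scene1 = Network(start="S1/0", final="S1/2")
scene1.add_arc("S1/0", "shot1", "S1/1")
scene1.add_arc("S1/1", "shot2", "S1/2")

# Scene-level network: scene1, then a user choice between scene2 and scene3.
video = Network(start="V/0", final="V/2")
video.add_arc("V/0", "scene1", "V/1", subnetwork=scene1)
video.add_arc("V/1", "scene2", "V/2")
video.add_arc("V/1", "scene3", "V/2")

# Simulated user: always expands subnetworks and picks the last alternative.
path = browse(video, choose=lambda opts: opts[-1], expand=lambda arc: True)
print(path)
```

Modeling the user's decisions as callbacks keeps the traversal separate from the interaction, which loosely mirrors the abstract's point that one model should cover both the video structure and the two-way communication with the user.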