On model-based clustering of video scenes using scenelets
Hong Lu, Yap-Peng Tan
Proceedings. IEEE International Conference on Multimedia and Expo, vol. 1, pp. 301-304. Published 2002-11-07.
DOI: https://doi.org/10.1109/ICME.2002.1035778
Citations: 5
Abstract
We propose a model-based approach to clustering video scenes based on scenelets, where a video scenelet is a short consecutive sample of frames from a video sequence. The approach uses an unsupervised method to represent the scenelets of a video with a concise Gaussian mixture model and to cluster them into different video scenes according to their visual similarity. In particular, the expectation-maximization (EM) algorithm is employed to estimate the unknown model parameters, and the Bayesian information criterion (BIC) is used to determine the optimal number and model of scene clusters in a principled manner. This approach differs fundamentally from many existing video clustering methods in that it does not require explicit knowledge of shot boundaries; instead, the shot boundaries can be obtained as a by-product of the scene clustering process. The proposed methods have been tested on various types of sports videos, and promising results are reported.
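The pipeline described in the abstract (fit a Gaussian mixture with EM, choose the number of clusters with BIC, read boundaries off the cluster labels) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each scenelet is reduced to a single hypothetical scalar feature, the data are synthetic, the initialization is a simple deterministic split of the sorted values, BIC is written in the "lower is better" form, and all function names are invented for this example. The abstract does not specify the actual features, covariance structure, or BIC convention used in the paper.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def e_step(xs, weights, means, variances):
    """Return per-sample responsibilities and the total log-likelihood."""
    resp, loglik = [], 0.0
    for x in xs:
        logs = [math.log(w) + log_gauss(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)
        lse = top + math.log(sum(math.exp(l - top) for l in logs))  # log-sum-exp
        loglik += lse
        resp.append([math.exp(l - lse) for l in logs])
    return resp, loglik

def fit_gmm(xs, k, iters=500, tol=1e-9, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture with EM (assumes len(xs) >= k)."""
    n = len(xs)
    srt = sorted(xs)
    # Assumed initialization: split the sorted values into k contiguous chunks.
    chunks = [srt[i * n // k:(i + 1) * n // k] for i in range(k)]
    means = [sum(c) / len(c) for c in chunks]
    variances = [max(sum((x - m) ** 2 for x in c) / len(c), var_floor)
                 for c, m in zip(chunks, means)]
    weights = [len(c) / n for c in chunks]
    prev = -math.inf
    for _ in range(iters):
        resp, loglik = e_step(xs, weights, means, variances)
        if loglik - prev < tol:  # stop once the likelihood no longer improves
            break
        prev = loglik
        for j in range(k):  # M-step: re-estimate weight, mean, variance
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj,
                var_floor)
    resp, loglik = e_step(xs, weights, means, variances)
    return weights, means, variances, resp, loglik

def bic(loglik, k, n):
    """BIC, lower is better; a 1-D k-component mixture has 3k - 1 free parameters."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)

def cluster_scenelets(xs, max_k):
    """Select the number of clusters by BIC, label scenelets, list label changes."""
    fits = {k: fit_gmm(xs, k) for k in range(1, max_k + 1)}
    best_k = min(fits, key=lambda k: bic(fits[k][4], k, len(xs)))
    labels = [max(range(best_k), key=lambda j: r[j]) for r in fits[best_k][3]]
    # Indices where the cluster label changes between consecutive scenelets.
    changes = [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]
    return best_k, labels, changes

# Synthetic scenelet features in temporal order: three segments of ten scenelets.
offsets = [-1.6, -1.1, -0.7, -0.4, -0.1, 0.1, 0.4, 0.7, 1.1, 1.6]
features = [c + o for c in (0.0, 20.0, 10.0) for o in offsets]
best_k, labels, changes = cluster_scenelets(features, max_k=4)
print(best_k, changes)
```

Because the scenelets are kept in temporal order, the positions where the assigned cluster changes serve as candidate boundaries without a separate shot-detection step, which is the sense in which boundaries fall out of the clustering. A real system would use multi-dimensional visual features and would likely need several EM restarts, since EM only finds a local optimum.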