Title: Conspicuity-based visual scene semantic similarity computing for video
Authors: Wei Wei, Tianyun Yan, Yuan-Mao Zhang
DOI: 10.1109/ICMLC.2010.5580490
Published in: 2010 International Conference on Machine Learning and Cybernetics, 2010-07-11
Citations: 0
Abstract
This paper proposes a framework for quantifying the semantic similarity of two video scenes based on a saliency-region representation of the visual scene. A frame-segment key-frame strategy is used to concisely represent video content in the temporal domain. A spatio-temporal conspicuity model for basic visual semantics, a neuromorphic model that simulates the human visual system, is used to select dynamic and static spatially salient areas. The basic visual semantics are then recognized with pattern classification techniques, and the similarity of the two visual scenes is calculated according to information-theoretic similarity principles and Tversky's set-theoretic similarity. Experimental results demonstrate that the framework can compute a quantitative semantic similarity between two video scenes.
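The abstract does not spell out the exact similarity formulation used, but Tversky's set-theoretic similarity has a standard form: S(A, B) = |A ∩ B| / (|A ∩ B| + α|A − B| + β|B − A|). A minimal sketch, assuming (as an illustration, not the authors' actual pipeline) that each scene is reduced to a set of recognized basic-visual-semantics labels:

```python
def tversky_similarity(a, b, alpha=0.5, beta=0.5):
    """Tversky index between two label sets.

    alpha/beta weight the features unique to each set; with
    alpha = beta = 1 this reduces to the Jaccard index, and with
    alpha = beta = 0.5 to the Dice coefficient.
    """
    a, b = set(a), set(b)
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


# Hypothetical label sets for two scenes (illustrative only):
scene1 = {"person", "car", "road", "tree"}
scene2 = {"person", "car", "building"}
print(tversky_similarity(scene1, scene2))  # symmetric when alpha == beta
```

With alpha = beta the measure is symmetric; Tversky's original formulation allows asymmetric weights, which can model a "variant vs. prototype" comparison between scenes.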