{"title":"Introducing motion information in dense feature classifiers","authors":"Claudiu Tanase, B. Mérialdo","doi":"10.1109/WIAMIS.2013.6616132","DOIUrl":null,"url":null,"abstract":"Semantic concept detection in large scale video collections is mostly achieved through a static analysis of selected keyframes. A popular choice for representing the visual content of an image is based on the pooling of local descriptors such as Dense SIFT. However, simple motion features such as optic flow can be extracted relatively easy from such keyframes. In this paper we propose an efficient addition to the DSIFT approach by including information derived from optic flow. Based on optic flow magnitude, we can estimate for each DSIFT patch whether it is static or moving. We modify the bag of words model used traditionally with DSIFT by creating two separate occurrence histograms instead of one: one for static patches and one for dynamic patches. We further refine this method by studying different separation thresholds and soft assign-ment, as well as different normalization techniques. Classifier score fusion is used to maximize the average precision of all these variants. Experimental results on the TRECVID Semantic Indexing collection show that by means of classifier fusion our method increases overall mean average precision of the DSIFT classifier from 0.061 to 0.106.","PeriodicalId":408077,"journal":{"name":"2013 14th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS)","volume":"62 11 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 14th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WIAMIS.2013.6616132","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Semantic concept detection in large-scale video collections is mostly achieved through a static analysis of selected keyframes. A popular choice for representing the visual content of an image is based on the pooling of local descriptors such as Dense SIFT. However, simple motion features such as optic flow can be extracted relatively easily from such keyframes. In this paper we propose an efficient addition to the DSIFT approach by including information derived from optic flow. Based on optic flow magnitude, we can estimate for each DSIFT patch whether it is static or moving. We modify the bag-of-words model traditionally used with DSIFT by creating two separate occurrence histograms instead of one: one for static patches and one for dynamic patches. We further refine this method by studying different separation thresholds and soft assignment, as well as different normalization techniques. Classifier score fusion is used to maximize the average precision across all these variants. Experimental results on the TRECVID Semantic Indexing collection show that, by means of classifier fusion, our method increases the overall mean average precision of the DSIFT classifier from 0.061 to 0.106.
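As a rough illustration of the motion-aware bag-of-words split described above, the sketch below (not the authors' code) assigns each DSIFT patch a visual word and a mean optic-flow magnitude, and accumulates patches whose magnitude exceeds a threshold into a dynamic histogram while the rest go into a static one. The function and parameter names (split_bow_histograms, tau, K), the hard threshold, the L1 normalization, and the final concatenation are illustrative assumptions; the paper additionally studies soft assignment near the threshold, other normalizations, and fusion of the resulting classifier scores.

```python
# Minimal sketch of a static/dynamic bag-of-words split, assuming visual-word
# assignments and per-patch optic-flow magnitudes have already been computed.
import numpy as np

def split_bow_histograms(words, flow_mag, K, tau=1.0):
    """Build separate occurrence histograms for static and dynamic DSIFT patches.

    words    -- visual-word index of each patch (length N)
    flow_mag -- mean optic-flow magnitude inside each patch (length N)
    K        -- vocabulary size
    tau      -- separation threshold on flow magnitude (hypothetical value)
    """
    words = np.asarray(words)
    flow_mag = np.asarray(flow_mag)

    dynamic = flow_mag > tau  # patches whose motion exceeds the threshold
    h_static = np.bincount(words[~dynamic], minlength=K).astype(float)
    h_dynamic = np.bincount(words[dynamic], minlength=K).astype(float)

    # L1-normalize each histogram (one of several possible normalization choices)
    for h in (h_static, h_dynamic):
        s = h.sum()
        if s > 0:
            h /= s

    # Concatenate into a single 2K-dimensional feature; an illustrative choice,
    # since the paper instead trains variants and fuses classifier scores.
    return np.concatenate([h_static, h_dynamic])

# Example: 5 patches, vocabulary of 4 visual words, threshold tau = 1.0
feat = split_bow_histograms(words=[0, 2, 2, 3, 1],
                            flow_mag=[0.1, 2.5, 0.3, 1.8, 0.0],
                            K=4)
print(feat.shape)  # (8,)
```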