Title: Interactive Event Recognition in Video
Authors: Mennan Güder, N. Cicekli
DOI: 10.1109/ISM.2013.24
URL: https://doi.org/10.1109/ISM.2013.24
Venue: 2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)
Volume: 57 1
Pages: 100-101
Publication date: 2013-12-09
Publication type: Journal Article
Citations: 0
Abstract
In this paper, we propose a multi-modal decision-level fusion framework to recognize events in videos. The main parts of the proposed framework are ontology-based event definition, structural video decomposition, temporal rule discovery, and event classification. Various decision sources, such as audio continuity, content similarity, and shot sequence characteristics, are combined with visual video feature sets and event descriptors during decision-level fusion. The method is considered interactive because of its user-directed ontology connection and temporal rule extraction strategies. It enables users to integrate available ontologies such as ImageNet and WordNet while defining new event types. Temporal rules are discovered by association rule mining. In the proposed approach, the computation- and I/O-intensive requirements of association rule mining are reduced by a one-pass frequent itemset extractor and the proposed rule definition strategy. The accuracy of the proposed methodology is evaluated on the TRECVid 2007 high-level feature detection data set by comparing its results with those of C4.5 decision tree classifiers, SVM classifiers, and Multiple Correspondence Analysis.