Audiovisual video context recognition using SVM and genetic algorithm fusion rule weighting
Mikko Roininen, E. Guldogan, M. Gabbouj
2011 9th International Workshop on Content-Based Multimedia Indexing (CBMI), 2011-06-13
DOI: 10.1109/CBMI.2011.5972541 (https://doi.org/10.1109/CBMI.2011.5972541)
Citation count: 1
Abstract
Recognizing the surrounding context from video recordings opens up possibilities for context awareness in video-capable mobile devices. Multimodal analysis can improve recognition accuracy and robustness across different use conditions. We present a multimodal video context recognition system that fuses audio and video cues using support vector machines (SVM) and simple rules whose weights are optimized with a genetic algorithm (GA). Multimodal recognition is shown to outperform the unimodal approaches in distinguishing among 21 everyday contexts. The highest correct classification rate, 0.844, is achieved with SVM-based fusion.
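
To make the rule-based fusion idea concrete, the following is a minimal sketch of one possible scheme: a weighted-sum rule that blends per-class scores from an audio classifier and a video classifier, with the single blending weight tuned by a tiny genetic algorithm. The abstract does not specify the rule, the GA encoding, or the features, so everything here is an assumption for illustration: the function names, the GA settings, and the three-class toy scores are all hypothetical, and the SVM-based fusion that gave the best reported result is not shown.

```python
import random

def fuse(p_audio, p_video, w):
    """Weighted-sum rule: blend per-class scores from the two modalities."""
    return [w * a + (1.0 - w) * v for a, v in zip(p_audio, p_video)]

def accuracy(w, audio_scores, video_scores, labels):
    """Fraction of samples whose fused arg-max matches the true label."""
    correct = 0
    for pa, pv, y in zip(audio_scores, video_scores, labels):
        fused = fuse(pa, pv, w)
        if max(range(len(fused)), key=fused.__getitem__) == y:
            correct += 1
    return correct / len(labels)

def ga_optimize_weight(audio_scores, video_scores, labels,
                       pop_size=20, generations=30, seed=0):
    """Tiny GA over one weight in [0, 1]; all settings are assumptions."""
    rng = random.Random(seed)

    def fitness(w):
        return accuracy(w, audio_scores, video_scores, labels)

    population = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        # Selection with elitism: the fitter half survives unchanged.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2.0           # arithmetic crossover
            child += rng.gauss(0.0, 0.1)    # Gaussian mutation
            children.append(min(1.0, max(0.0, child)))  # clamp to [0, 1]
        population = parents + children
    return max(population, key=fitness)

# Hypothetical toy scores for 3 classes (not data from the paper).
AUDIO = [[0.90, 0.05, 0.05], [0.40, 0.30, 0.30],
         [0.10, 0.10, 0.80], [0.50, 0.45, 0.05]]
VIDEO = [[0.30, 0.40, 0.30], [0.05, 0.90, 0.05],
         [0.20, 0.20, 0.60], [0.45, 0.50, 0.05]]
LABELS = [0, 1, 2, 0]

if __name__ == "__main__":
    w = ga_optimize_weight(AUDIO, VIDEO, LABELS)
    print("audio only:", accuracy(1.0, AUDIO, VIDEO, LABELS))
    print("video only:", accuracy(0.0, AUDIO, VIDEO, LABELS))
    print("fused:", accuracy(w, AUDIO, VIDEO, LABELS), "with w =", round(w, 3))
```

In this toy setup each modality alone misclassifies at least one sample, while an intermediate weight classifies all four correctly, which mirrors the abstract's claim only qualitatively. A real system would evaluate the fitness on held-out data and would typically learn one weight per modality or per class rather than a single scalar.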