{"title":"基于视听和脑电图的情感视频内容提取注意建模","authors":"I. Mehmood, M. Sajjad, S. Baik, Seungmin Rho","doi":"10.1109/PLATCON.2015.13","DOIUrl":null,"url":null,"abstract":"Video summarization is a procedure to reduce redundancy and generate concise representation of the video data. Extracting affective key frames from video sequences is an enthusiastic approach amongst video summarization schemes. Affective key frames refer to the intensity and type of feelings that are contained in video and expected to arise in spectators mind. Recent summarization schemes consider audio and visual information. However, these data modalities are not sufficient to accurately perceive human attention, failing to extract semantically relevant content. Video content incites strong neural responses in users, which can be measured by analyzing electroencephalography (EEG) brain signals. Merging EEG and multimedia analysis can serve as a bridge, linking the digital representation of multimedia and user perception. In this context, we propose an affective video content extraction scheme that integrates human neuronal signals with audio-visual features for better perception and comprehension of digital videos. Experimental results shown that the propose model can accurately reflect user preferences, and facilitate extraction of highly affective and personalized summaries.","PeriodicalId":220038,"journal":{"name":"2015 International Conference on Platform Technology and Service","volume":"42 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Audio-Visual and EEG-Based Attention Modeling for Extraction of Affective Video Content\",\"authors\":\"I. Mehmood, M. Sajjad, S. Baik, Seungmin Rho\",\"doi\":\"10.1109/PLATCON.2015.13\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Video summarization is a procedure to reduce redundancy and generate concise representation of the video data. Extracting affective key frames from video sequences is an enthusiastic approach amongst video summarization schemes. Affective key frames refer to the intensity and type of feelings that are contained in video and expected to arise in spectators mind. Recent summarization schemes consider audio and visual information. However, these data modalities are not sufficient to accurately perceive human attention, failing to extract semantically relevant content. Video content incites strong neural responses in users, which can be measured by analyzing electroencephalography (EEG) brain signals. Merging EEG and multimedia analysis can serve as a bridge, linking the digital representation of multimedia and user perception. In this context, we propose an affective video content extraction scheme that integrates human neuronal signals with audio-visual features for better perception and comprehension of digital videos. 
Experimental results shown that the propose model can accurately reflect user preferences, and facilitate extraction of highly affective and personalized summaries.\",\"PeriodicalId\":220038,\"journal\":{\"name\":\"2015 International Conference on Platform Technology and Service\",\"volume\":\"42 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-01-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 International Conference on Platform Technology and Service\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PLATCON.2015.13\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 International Conference on Platform Technology and Service","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PLATCON.2015.13","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Audio-Visual and EEG-Based Attention Modeling for Extraction of Affective Video Content
Video summarization is a procedure for reducing redundancy and generating a concise representation of video data. Extracting affective key frames from video sequences is a popular approach among video summarization schemes. Affective key frames capture the intensity and type of feelings contained in a video and expected to arise in spectators' minds. Recent summarization schemes consider audio and visual information; however, these modalities are not sufficient to accurately model human attention and often fail to extract semantically relevant content. Video content elicits strong neural responses in viewers, which can be measured by analyzing electroencephalography (EEG) signals. Merging EEG with multimedia analysis can therefore serve as a bridge between the digital representation of multimedia and user perception. In this context, we propose an affective video content extraction scheme that integrates human neuronal signals with audio-visual features for better perception and comprehension of digital videos. Experimental results show that the proposed model accurately reflects user preferences and facilitates the extraction of highly affective, personalized summaries.
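The abstract does not specify how the EEG and audio-visual signals are combined. As a rough illustration only, the sketch below shows one plausible late-fusion scheme: normalize a per-frame audio-visual saliency curve and a per-frame EEG-derived arousal curve, take a weighted sum, and greedily pick high-scoring, temporally spaced frames as affective key frames. All names (`fuse_attention`, `select_key_frames`), the equal fusion weights, the minimum-gap heuristic, and the synthetic scores are assumptions for illustration, not the authors' published method.

```python
import numpy as np

def normalize(x):
    # Min-max normalize a 1-D per-frame score curve to [0, 1].
    x = np.asarray(x, dtype=float)
    rng = x.max() - x.min()
    return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

def fuse_attention(av_saliency, eeg_arousal, w_av=0.5, w_eeg=0.5):
    # Hypothetical late fusion: weighted sum of normalized audio-visual
    # saliency and EEG-derived arousal, yielding one score per frame.
    return w_av * normalize(av_saliency) + w_eeg * normalize(eeg_arousal)

def select_key_frames(fused, num_frames=5, min_gap=30):
    # Greedily pick the highest-scoring frames, enforcing a minimum
    # temporal gap so the summary is not dominated by a single shot.
    order = np.argsort(fused)[::-1]
    chosen = []
    for idx in order:
        if all(abs(int(idx) - c) >= min_gap for c in chosen):
            chosen.append(int(idx))
        if len(chosen) == num_frames:
            break
    return sorted(chosen)

# Toy demo on synthetic per-frame scores (300 frames).
rng = np.random.default_rng(0)
av = rng.random(300)    # stand-in for audio-visual saliency per frame
eeg = rng.random(300)   # stand-in for EEG arousal per frame
scores = fuse_attention(av, eeg)
print(select_key_frames(scores))
```

In a real pipeline the `eeg_arousal` input would come from features extracted from recorded EEG (e.g., band-power measures), and the fusion weights could be tuned per user to produce the personalized summaries the abstract describes.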